acl acl2013 acl2013-158 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: K. Tamsin Maxwell ; Jon Oberlander ; W. Bruce Croft
Abstract: Techniques that compare short text segments using dependency paths (or simply, paths) appear in a wide range of automated language processing applications including question answering (QA). However, few models in ad hoc information retrieval (IR) use paths for document ranking due to the prohibitive cost of parsing a retrieval collection. In this paper, we introduce a flexible notion of paths that describe chains of words on a dependency path. These chains, or catenae, are readily applied in standard IR models. Informative catenae are selected using supervised machine learning with linguistically informed features and compared to both non-linguistic terms and catenae selected heuristically with filters derived from work on paths. Automatically selected catenae of 1-2 words deliver significant performance gains on three TREC collections.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Techniques that compare short text segments using dependency paths (or simply, paths) appear in a wide range of automated language processing applications including question answering (QA). [sent-12, score-0.236]
2 However, few models in ad hoc information retrieval (IR) use paths for document ranking due to the prohibitive cost of parsing a retrieval collection. [sent-13, score-0.357]
3 In this paper, we introduce a flexible notion of paths that describe chains of words on a dependency path. [sent-14, score-0.243]
4 Informative catenae are selected using supervised machine learning with linguistically informed features and compared to both non-linguistic terms and catenae selected heuristically with filters derived from work on paths. [sent-16, score-1.866]
5 Automatically selected catenae of 1-2 words deliver significant performance gains on three TREC collections. [sent-17, score-0.894]
6 Much work on sentence similarity using dependency paths focuses on question answering (QA) where textual inference requires attention to linguistic detail. [sent-25, score-0.236]
7 In this paper, we explore a flexible application of dependency paths that overcomes this difficulty. [sent-29, score-0.243]
8 We reduce paths to chains of words called catenae (Osborne and Groß, 2012) that capture salient semantic content in an underspecified manner. [sent-30, score-1.105]
9 Moreover, catenae are compatible with a variety of existing IR models. [sent-33, score-0.894]
10 We hypothesize that catenae identify most units of salient knowledge in text. [sent-34, score-0.983]
11 To our knowledge, this paper is the first time that catenae are proposed as a means for term selection in IR, and where ellipsis is considered as a means for identification of semantic units. [sent-36, score-1.06]
12 Previous heuristic filters for dependency paths (Lin and Pantel, 2001 ; Shen et al. [sent-38, score-0.287]
13 Alternatively, treating all paths as equally informative (Punyakanok et al. [sent-41, score-0.187]
14 The challenge of path selection is that no explicit information in text indicates which paths are relevant. [sent-44, score-0.227]
15 Consider the catenae captured by heuristic filters for the TREC query, ‘What role does blood-alcohol level play in automobile accident fatalities’ (#358, Table 1). [sent-45, score-1.008]
16 and ‘level play’ do not have an important semantic relationship relative to the query, yet these catenae are described by parent-child relations that are commonly used to filter paths in text processing applications. [sent-52, score-1.077]
17 Alternative filters that avoid such trivial word combinations also omit descriptions of key entities such as ‘blood alcohol’, and identify longer catenae that may be overly restrictive. [sent-53, score-0.972]
18 These shortcomings suggest that an optimized selection process may improve performance of techniques that use dependency paths in ad hoc IR. [sent-54, score-0.403]
19 We identify three previously proposed selection methods, and compare them on the task of catenae selection for ad hoc IR. [sent-55, score-1.127]
20 We also develop a linguistically informed machine learning technique for catenae selection that captures both key aspects of heuristic filters, and novel characteristics of catenae and paths. [sent-58, score-1.86]
21 The basic idea is that selection, or weighting, of catenae can be improved by features that are specific to paths, rather than generic for all terms. [sent-59, score-0.894]
22 Results show that our selection method is more effective in identifying key catenae compared to previously proposed filters. [sent-60, score-0.934]
23 Integration of the identified catenae in queries also improves IR effectiveness compared to a highly effective baseline that uses sequential bigrams with no linguistic knowledge. [sent-61, score-0.985]
24 This model represents the obvious alternative to catenae for term selection in IR. [sent-62, score-0.934]
25 §2 reviews related work, §3 describes catenae and their linguistic motivation, and §4 describes our selection method. [sent-64, score-0.934]
26 2 Related work Techniques that compare short text segments using dependency paths are applied to a wide range of automated language processing tasks, including paraphrasing, summarization, entailment detection, QA, machine translation and the evaluation of word, phrase and sentence similarity. [sent-68, score-0.223]
27 A generic approach uses a matching function to compare a dependency path between any two stemmed terms x and y in a sentence A with any dependency path between x and y in sentence B. [sent-69, score-0.208]
28 The match score for A and B is computed over all dependency paths in A. [sent-70, score-0.223]
29 For example, Lin and Pantel (2001) present a method to derive paraphrasing rules for QA using analysis of paths that connect two nouns; Echihabi and Marcu (2003) align all paths in questions with trees for heuristically pruned answers; Cui et al. [sent-73, score-0.306]
30 (2007) use quasi-synchronous translation to map all parent-child paths in a question to any path in an answer; and Moschitti (2008) explores syntactic and semantic kernels for QA classification. [sent-75, score-0.227]
31 , 2007) that use dependency paths to address long-distance dependencies and normalize spurious differences in surface text. [sent-81, score-0.223]
32 Techniques using dependency paths in both QA and ad hoc IR show promising results, but there is no clear understanding of which path constraints result in the greatest IR effectiveness. [sent-90, score-0.397]
33 We directly compare selections of catenae as a simplified representation of paths. [sent-91, score-0.91]
34 In addition, a vast number of methods have been presented for term weighting and selection in ad hoc IR. [sent-92, score-0.18]
35 Our supervised selection extends the successful method presented by Bendersky and Croft (2008) for selection and weighting of query noun phrases (NPs). [sent-93, score-0.195]
36 In contrast to this work, we apply linguistic features that are specific to catenae and dependency paths, and select among units containing more than two content-bearing words. [sent-96, score-0.995]
37 A catena is defined on a dependency graph that has lexical nodes (or words) linked by binary asymmetrical relations called dependencies. [sent-99, score-0.196]
38 A catena is a word, or sequence of words that are continuous with respect to a walk on a dependency graph. [Figure: ellipsis in the coordinated question ‘Is polio under control in China, and is polio under control in India?’, marking the first conjunct (antecedent clause), the second conjunct (elliptical/target clause), the antecedent and the elided text.] [sent-102, score-0.527]
39 1 shows a dependency parse that generates 21 catenae in total (using i for Xi): 1, 2, 3, 4, 5, 6, 12, 23, 34, 45, 56, 123, 234, 345, 456, 1234, 2345, 3456, 12345, 23456, 123456. [sent-106, score-0.991]
40 We process catenae to remove stop words on the INQUERY stoplist (Allan et al. [sent-107, score-0.894]
41 This results in a reduced set of catenae as shown in Fig. [sent-109, score-0.894]
42 A catena is an economical, intuitive lexical unit that corresponds to a dependency path and is argued to play an important role in syntax (Osborne et al. [sent-113, score-0.23]
43 In this paper, we explore catenae instead of paths for ad hoc IR due to their suitability for efficient IR models and flexible representation of language semantics. [sent-115, score-1.22]
44 Specifically, we note that catenae identify words that can be omitted in elliptical constructions (Osborne et al. [sent-116, score-0.986]
45 To clarify this insight, we briefly review catenae in ellipsis. [sent-119, score-0.894]
46 In contrast, the omitted words in successful ellipsis do form catenae, and they represent informative word combinations with respect to the query. [sent-145, score-0.225]
47 This observation leads us to an ellipsis hypothesis: Ellipsis hypothesis: For queries formulated into coordinated structures, the subset of catenae that are elliptical candidates identify the salient semantic units in the query. [sent-146, score-1.24]
48 2 Limitations of paths and catenae The prediction of salient semantic units by catenae is quite robust. [sent-148, score-2.03]
49 However, there are two problems that can limit the effectiveness of any technique that uses catenae or dependency paths in IR. [sent-149, score-1.134]
50 The elided words ‘polio in china’ are relevant to a base query, ‘Is polio under control in China? [sent-154, score-0.218]
51 2) Rising: Automatic extraction of catenae is limited by the phenomenon of rising. [sent-163, score-0.894]
52 [Figure 5: A parse with and without rising. Standard structure: ‘used a toxic chemical as a weapon’; rising structure: ‘A toxic chemical used as a weapon’.] [sent-164, score-0.191]
53 The dashed dependency edge marks where a head is not also the governor and the g-script marks the governor of the risen catena. [sent-165, score-0.208]
54 Let the governor of a catena be the word that licenses it (in Fig. [sent-166, score-0.184]
55 More specifically, rising occurs when a catena is separated from its governor by words that its governor does not dominate, or the catena dominates the governor, as in Fig. [sent-175, score-0.382]
56 4 Selection method for catenae Catenae describe relatively few of the possible word combinations in a sentence, but still include many combinations that do not result in successful ellipsis and are not informative for IR. [sent-178, score-1.099]
57 Candidate catenae are identified using two constraints that enable more efficient extraction: stopwords are removed, and stopped catenae must contain fewer than four words (single words are permitted). [sent-180, score-1.801]
58 We use a pseudo-projective joint dependency parse and semantic role labelling system (Johansson and Nugues, 2008) to generate the dependency parse. [sent-181, score-0.18]
59 For comparison, catenae extracted from 500 queries using the Stanford dependency parser (de Marneffe et al. [sent-184, score-1.019]
60 , 2006) overlap with 77% of catenae extracted from the same queries using the applied parser. [sent-185, score-0.949]
61 1 Feature Classes Four feature classes are presented in Table 2: Ellipsis candidates: The ellipsis hypothesis suggests that informative catenae are elliptical candidates. [sent-187, score-1.081]
62 To enable extraction of characteristic features we (a) construct a coordinated query by adding the query to itself; and (b) elide catenae from the second conjunct. [sent-189, score-1.084]
63 we have: (a) Is polio under control in China, and is polio under control in China? [sent-191, score-0.348]
64 (b) Is polio under control in China, and is polio in China? [sent-192, score-0.313]
65 In addition, variability in the separation distance in documents observed for words that have governor-dependent relations in queries has been proposed for identification of promising paths (Song et al. [sent-196, score-0.244]
66 We also observe that due to the phenomenon of rising, words that form catenae can be discontinuous in text, and the ability of catenae to match similar word combinations is limited by variability of how they appear in documents. [sent-198, score-1.839]
67 to normalize co-occurrence counts for catenae of different lengths: a factor |c|, where |c| is the number of words in catena c (Hagen et al. [sent-202, score-1.003]
68 1 Classification Catenae selection is framed as a supervised classification problem trained on binary human judgments of informativeness: how well catenae represent a query and discriminate between relevant and non-relevant documents in a collection. [sent-212, score-1.029]
69 Kappa for two annotators on catenae in 100 sample queries was 0. [sent-213, score-0.949]
70 5 decision trees as base learners to classify catenae as informative or not. [sent-220, score-0.928]
71 There are roughly three times the number of uninformative catenae compared to informative catenae. [sent-228, score-0.942]
72 Any examples that replicate catenae in the test collection are excluded. [sent-248, score-0.894]
73 One potential explanation is that Robust04 queries are longer on average (up to 32 content words per query, compared to up to 16 words) so they generate a more diverse set of catenae that are more easily distinguished with respect to informativeness. [sent-257, score-0.949]
74 For example, the query new york city in Indri query language is: #weight( λ1 #combine(new york city) λ2 #combine(#1(new york) #1(york city)) λ3 #combine(#uw8(new york) #uw8(york city))). SD is a competitive baseline in IR (Bendersky and Croft, 2008; Park et al. [sent-269, score-0.267]
75 Our reformulated model uses the same query format as SD, but the second and third cliques contain filtered catenae instead of query bigrams. [sent-272, score-1.085]
76 In addition, because catenae may be multi-word units, we adjust the unordered window size to 4 ∗ |c|. [sent-273, score-0.894]
77 So, if two catenae ‘york’ and ‘new york city’ are selected, the last clique has the form: λ3 #combine( york #uw12(new york city)). This query representation enables word relations to be explicitly indicated while maintaining efficient and flexible matching of catenae in documents. [sent-274, score-2.108]
78 2 Baseline catenae selection We explore four filters for catenae. [sent-277, score-0.98]
79 , 2005), semantic roles (Moschitti, 2008), and all dependency paths except those that occur between words in the same base chunk (e. [sent-284, score-0.236]
80 3 Experiments Experiments compare queries reformulated using catenae selected by baseline filters and our supervised selection method (SFeat) to SD and a bag-of-words model (QL). [sent-307, score-1.074]
81 We also compare IR effectiveness of all catenae filtered using SFeat with approaches that combine SFeat with baseline filters. [sent-308, score-0.924]
82 4 Results Results in Table 4 show significant improvement in mean average precision (MAP) of queries using catenae compared to QL. [sent-312, score-0.949]
83 Consistent improvements over SD are also demonstrated for supervised selection applied to all catenae (SFeat) and catenae with only 1-2 words (SF-12) across all collections (Table 5). [sent-313, score-1.875]
84 Governor-dependent relations for WT10G are an exception and we speculate that this is due to a negative influence of 3-word catenae for this collection. [sent-316, score-0.911]
85 This means that 3-word catenae (in all models except GovDep) tend to include uninformative words, such as ‘reasons’ in ‘fasting religious reasons’. [sent-320, score-0.908]
86 In contrast, 3-word catenae in other collections tend to identify query subconcepts or phrases, such as ‘science plants water’. [sent-321, score-1.013]
87 Classification results for catenae separated by length, such that the classifier for catenae of a specific length is trained on examples of catenae with the same length, confirm this intuition. [sent-322, score-2.682]
88 The rejection rate for 3-word catenae is twice as high for WT10G as for other collections. [sent-323, score-0.894]
89 It is also more difficult to distinguish informative 3-word catenae compared to catenae with 1-2 words. [sent-324, score-1.822]
90 The SF-12 model combines catenae predicted for lengths 1 and 2. [sent-326, score-0.894]
91 Its strong performance across all collections suggests that most of the benefit derived from catenae in IR is found in governor-dependent and single word units, where single words are important (GovDep uses only 2-word catenae). [sent-327, score-0.923]
92 Finally, we review selected catenae for queries that perform significantly better or worse than SD (> 75% change in MAP). [sent-331, score-0.949]
93 The best IR effectiveness occurs when selected catenae clearly focus on the most important aspect of a query. [sent-332, score-0.911]
94 Poor performance is caused by a lack of focus in a catenae set, even though selected catenae are reasonable, or an emphasis on words that are not central to the query. [sent-333, score-1.788]
95 The latter can occur when words that are not essential to query semantics appear in many catenae due to their position in the dependency graph. [sent-334, score-1.041]
96 7 Conclusion We presented a flexible implementation of dependency paths for long queries in ad hoc IR that does not require dependency parsing a collection. [sent-335, score-0.508]
97 Our supervised selection technique for catenae addresses the need to balance a representation of language expressiveness with effective, efficient statistical methods. [sent-336, score-0.965]
98 It is not possible to directly compare performance of our approach with ad hoc techniques in IR that parse a retrieval collection. [sent-338, score-0.199]
99 However, we note that a recent result using query translation based on dependency paths (Park et al. [sent-339, score-0.3]
100 We conclude that catenae do not replace path-based techniques, but may offer some insight into their application, and have particular value when it is not practical to parse target documents to determine text similarity. [sent-342, score-0.921]
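Two mechanical steps in the sentences above lend themselves to a short sketch: enumerating catenae from a dependency parse (sentence 39) and formatting selected catenae as an unordered-window clique (sentences 76 and 77). The Python below is an illustration only, not the authors' implementation. The function names `enumerate_catenae` and `unordered_clique` are hypothetical, the parse is assumed to be a simple chain in which each word Xi is linked to Xi+1 (which is what the 21 listed catenae imply), and catenae are found by brute force over word subsets, which is only practical for short queries.

```python
from itertools import combinations


def enumerate_catenae(n_words, edges, max_len=None):
    """Return every catena of a dependency tree as a tuple of word indices.

    A catena is treated here as any set of words that is connected
    through dependency edges. Words are indexed 1..n_words and `edges`
    holds (head, dependent) pairs. `max_len` optionally caps the length.
    """
    neighbours = {i: set() for i in range(1, n_words + 1)}
    for head, dependent in edges:
        neighbours[head].add(dependent)
        neighbours[dependent].add(head)

    def is_connected(words):
        # Depth-first search restricted to the candidate word set.
        words = set(words)
        start = next(iter(words))
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node] & words:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen == words

    limit = n_words if max_len is None else min(max_len, n_words)
    catenae = []
    for size in range(1, limit + 1):
        for words in combinations(range(1, n_words + 1), size):
            if is_connected(words):
                catenae.append(words)
    return catenae


def unordered_clique(catenae):
    """Format catenae (space-separated strings) as an Indri #combine clique.

    Single words are emitted as plain terms; multi-word catenae get an
    unordered window of size 4 * |c|, as described in sentence 76.
    """
    parts = []
    for catena in catenae:
        words = catena.split()
        if len(words) == 1:
            parts.append(words[0])
        else:
            parts.append("#uw{}({})".format(4 * len(words), " ".join(words)))
    return "#combine({})".format(" ".join(parts))


# Assumed chain parse X1-X2-...-X6 for the six-word example.
chain_edges = [(i, i + 1) for i in range(1, 6)]
print(len(enumerate_catenae(6, chain_edges)))  # 21

print(unordered_clique(["york", "new york city"]))
# #combine(york #uw12(new york city))
```

Calling `enumerate_catenae(6, chain_edges, max_len=3)` on the same chain yields 15 candidates, in line with the constraint in sentence 57 that stopped catenae contain fewer than four words.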
wordName wordTfidf (topN-words)
[('catenae', 0.894), ('paths', 0.153), ('polio', 0.139), ('ellipsis', 0.113), ('ir', 0.112), ('catena', 0.109), ('hoc', 0.082), ('query', 0.077), ('dependency', 0.07), ('governor', 0.06), ('sfeat', 0.06), ('ad', 0.058), ('queries', 0.055), ('govdep', 0.05), ('clique', 0.049), ('york', 0.048), ('filters', 0.046), ('croft', 0.045), ('salient', 0.045), ('elided', 0.044), ('rising', 0.044), ('gro', 0.044), ('ql', 0.04), ('selection', 0.04), ('elliptical', 0.04), ('bruce', 0.04), ('omitted', 0.039), ('sigir', 0.038), ('sd', 0.038), ('osborne', 0.037), ('park', 0.036), ('china', 0.036), ('coordinated', 0.036), ('control', 0.035), ('path', 0.034), ('ny', 0.034), ('informative', 0.034), ('dependence', 0.033), ('chemical', 0.032), ('retrieval', 0.032), ('qa', 0.032), ('units', 0.031), ('nomend', 0.03), ('collections', 0.029), ('song', 0.029), ('parse', 0.027), ('toxic', 0.026), ('bendersky', 0.026), ('metzler', 0.024), ('weapon', 0.024), ('cui', 0.022), ('punyakanok', 0.022), ('reformulated', 0.021), ('successful', 0.02), ('freund', 0.02), ('fatalities', 0.02), ('inquery', 0.02), ('maisonnasse', 0.02), ('flexible', 0.02), ('variability', 0.019), ('combinations', 0.019), ('bigrams', 0.019), ('supervised', 0.018), ('schapire', 0.018), ('heuristic', 0.018), ('acm', 0.018), ('accident', 0.018), ('risen', 0.018), ('srikanth', 0.018), ('city', 0.017), ('play', 0.017), ('effectiveness', 0.017), ('gao', 0.017), ('relations', 0.017), ('trec', 0.016), ('moschitti', 0.016), ('cikm', 0.016), ('conjunct', 0.016), ('selections', 0.016), ('cliques', 0.016), ('incoherent', 0.016), ('qualify', 0.016), ('verbose', 0.015), ('automobile', 0.015), ('licenses', 0.015), ('surdeanu', 0.015), ('syntactic', 0.014), ('uninformative', 0.014), ('informed', 0.014), ('answers', 0.014), ('quasisynchronous', 0.013), ('antecedent', 0.013), ('semantic', 0.013), ('efficient', 0.013), ('identify', 0.013), ('question', 0.013), ('combine', 0.013), ('discontinuous', 0.013), ('pantel', 0.012), ('predictions', 
0.012)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999982 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval
2 0.099646233 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval
Author: Xuchen Yao ; Benjamin Van Durme ; Peter Clark
Abstract: Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily performance gain for QA. We propose to tightly integrate them by coupling automatically learned features for answer extraction to a shallow-structured IR model. Our method is very quick to implement, and significantly improves IR for QA (measured in Mean Average Precision and Mean Reciprocal Rank) by 10%-20% against an uncoupled retrieval baseline in both document and passage retrieval, which further leads to a downstream 20% improvement in QA F1.
3 0.06655141 55 acl-2013-Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?
Author: Romain Deveaud ; Eric SanJuan ; Patrice Bellot
Abstract: The current topic modeling approaches for Information Retrieval do not allow to explicitly model query-oriented latent topics. More, the semantic coherence of the topics has never been considered in this field. We propose a model-based feedback approach that learns Latent Dirichlet Allocation topic models on the top-ranked pseudo-relevant feedback, and we measure the semantic coherence of those topics. We perform a first experimental evaluation using two major TREC test collections. Results show that retrieval performances tend to be better when using topics with higher semantic coherence.
4 0.053940155 273 acl-2013-Paraphrasing Adaptation for Web Search Ranking
Author: Chenguang Wang ; Nan Duan ; Ming Zhou ; Ming Zhang
Abstract: Mismatch between queries and documents is a key issue for the web search task. In order to narrow down such mismatch, in this paper, we present an in-depth investigation on adapting a paraphrasing technique to web search from three aspects: a search-oriented paraphrasing model; an NDCG-based parameter optimization algorithm; an enhanced ranking model leveraging augmented features computed on paraphrases of original queries. Experiments performed on the large scale query-document data set show that, the search performance can be significantly improved, with +3.28% and +1.14% NDCG gains on dev and test sets respectively.
5 0.052480515 99 acl-2013-Crowd Prefers the Middle Path: A New IAA Metric for Crowdsourcing Reveals Turker Biases in Query Segmentation
Author: Rohan Ramanath ; Monojit Choudhury ; Kalika Bali ; Rishiraj Saha Roy
Abstract: Query segmentation, like text chunking, is the first step towards query understanding. In this study, we explore the effectiveness of crowdsourcing for this task. Through carefully designed control experiments and Inter Annotator Agreement metrics for analysis of experimental data, we show that crowdsourcing may not be a suitable approach for query segmentation because the crowd seems to have a very strong bias towards dividing the query into roughly equal (often only two) parts. Similarly, in the case of hierarchical or nested segmentation, turkers have a strong preference towards balanced binary trees.
6 0.050024934 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models
7 0.04258069 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
8 0.041617651 183 acl-2013-ICARUS - An Extensible Graphical Search Tool for Dependency Treebanks
9 0.040450964 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering
10 0.040122576 290 acl-2013-Question Analysis for Polish Question Answering
11 0.040071454 224 acl-2013-Learning to Extract International Relations from Political Context
12 0.039667323 28 acl-2013-A Unified Morpho-Syntactic Scheme of Stanford Dependencies
13 0.03511931 231 acl-2013-Linggle: a Web-scale Linguistic Search Engine for Words in Context
14 0.032994933 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing
15 0.032803863 208 acl-2013-Joint Inference for Heterogeneous Dependency Parsing
16 0.0327526 268 acl-2013-PATHS: A System for Accessing Cultural Heritage Collections
17 0.032414563 300 acl-2013-Reducing Annotation Effort for Quality Estimation via Active Learning
18 0.032019004 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation
19 0.031650946 98 acl-2013-Cross-lingual Transfer of Semantic Role Labeling Models
20 0.030718079 365 acl-2013-Understanding Tables in Context Using Standard NLP Toolkits
topicId topicWeight
[(0, 0.093), (1, 0.007), (2, -0.019), (3, -0.042), (4, 0.007), (5, 0.013), (6, 0.02), (7, -0.076), (8, 0.025), (9, -0.007), (10, 0.024), (11, 0.004), (12, -0.009), (13, 0.027), (14, 0.008), (15, 0.03), (16, -0.014), (17, -0.011), (18, 0.003), (19, -0.004), (20, 0.003), (21, 0.024), (22, -0.031), (23, -0.011), (24, 0.006), (25, -0.003), (26, -0.038), (27, -0.016), (28, 0.002), (29, 0.003), (30, 0.0), (31, 0.004), (32, -0.103), (33, -0.012), (34, 0.039), (35, -0.051), (36, 0.072), (37, -0.008), (38, 0.015), (39, 0.027), (40, 0.027), (41, -0.009), (42, 0.0), (43, 0.008), (44, -0.014), (45, -0.036), (46, -0.003), (47, -0.01), (48, -0.035), (49, -0.042)]
simIndex simValue paperId paperTitle
same-paper 1 0.85494435 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval
2 0.72422725 273 acl-2013-Paraphrasing Adaptation for Web Search Ranking
Author: Chenguang Wang ; Nan Duan ; Ming Zhou ; Ming Zhang
Abstract: Mismatch between queries and documents is a key issue for the web search task. In order to narrow down such mismatch, in this paper, we present an in-depth investigation on adapting a paraphrasing technique to web search from three aspects: a search-oriented paraphrasing model; an NDCG-based parameter optimization algorithm; an enhanced ranking model leveraging augmented features computed on paraphrases of original queries. Experiments performed on the large scale query-document data set show that, the search performance can be significantly improved, with +3.28% and +1.14% NDCG gains on dev and test sets respectively.
3 0.64619601 183 acl-2013-ICARUS - An Extensible Graphical Search Tool for Dependency Treebanks
Author: Markus Gartner ; Gregor Thiele ; Wolfgang Seeker ; Anders Bjorkelund ; Jonas Kuhn
Abstract: We present ICARUS, a versatile graphical search tool to query dependency treebanks. Search results can be inspected both quantitatively and qualitatively by means of frequency lists, tables, or dependency graphs. ICARUS also ships with plugins that enable it to interface with tool chains running either locally or remotely.
4 0.59365904 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering
Author: Anthony Fader ; Luke Zettlemoyer ; Oren Etzioni
Abstract: We study question answering as a machine learning problem, and induce a function that maps open-domain questions to queries over a database of web extractions. Given a large, community-authored, question-paraphrase corpus, we demonstrate that it is possible to learn a semantic lexicon and linear ranking function without manually annotating questions. Our approach automatically generalizes a seed lexicon and includes a scalable, parallelized perceptron parameter estimation scheme. Experiments show that our approach more than quadruples the recall of the seed lexicon, with only an 8% loss in precision.
5 0.58975464 271 acl-2013-ParaQuery: Making Sense of Paraphrase Collections
Author: Lili Kotlerman ; Nitin Madnani ; Aoife Cahill
Abstract: Pivoting on bilingual parallel corpora is a popular approach for paraphrase acquisition. Although such pivoted paraphrase collections have been successfully used to improve the performance of several different NLP applications, it is still difficult to get an intrinsic estimate of the quality and coverage of the paraphrases contained in these collections. We present ParaQuery, a tool that helps a user interactively explore and characterize a given pivoted paraphrase collection, analyze its utility for a particular domain, and compare it to other popular lexical similarity resources all within a single interface.
6 0.58416349 290 acl-2013-Question Analysis for Polish Question Answering
8 0.55284065 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval
9 0.54547948 231 acl-2013-Linggle: a Web-scale Linguistic Search Engine for Words in Context
10 0.54515672 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models
11 0.48005161 55 acl-2013-Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?
12 0.47584242 225 acl-2013-Learning to Order Natural Language Texts
13 0.47522324 268 acl-2013-PATHS: A System for Accessing Cultural Heritage Collections
14 0.47195458 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
15 0.46159193 218 acl-2013-Latent Semantic Tensor Indexing for Community-based Question Answering
16 0.45497668 332 acl-2013-Subtree Extractive Summarization via Submodular Maximization
17 0.4543865 215 acl-2013-Large-scale Semantic Parsing via Schema Matching and Lexicon Extension
18 0.45277599 94 acl-2013-Coordination Structures in Dependency Treebanks
19 0.44759625 364 acl-2013-Typesetting for Improved Readability using Lexical and Syntactic Information
20 0.44377649 367 acl-2013-Universal Conceptual Cognitive Annotation (UCCA)
topicId topicWeight
[(0, 0.039), (6, 0.044), (11, 0.058), (19, 0.013), (24, 0.054), (26, 0.044), (35, 0.127), (42, 0.041), (48, 0.033), (64, 0.042), (66, 0.204), (70, 0.042), (88, 0.041), (90, 0.025), (95, 0.052)]
simIndex simValue paperId paperTitle
same-paper 1 0.8157773 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval
Author: K. Tamsin Maxwell ; Jon Oberlander ; W. Bruce Croft
Abstract: Techniques that compare short text segments using dependency paths (or simply, paths) appear in a wide range of automated language processing applications including question answering (QA). However, few models in ad hoc information retrieval (IR) use paths for document ranking due to the prohibitive cost of parsing a retrieval collection. In this paper, we introduce a flexible notion of paths that describe chains of words on a dependency path. These chains, or catenae, are readily applied in standard IR models. Informative catenae are selected using supervised machine learning with linguistically informed features and compared to both non-linguistic terms and catenae selected heuristically with filters derived from work on paths. Automatically selected catenae of 1-2 words deliver significant performance gains on three TREC collections.
2 0.75592154 148 acl-2013-Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams
Author: Svitlana Volkova ; Theresa Wilson ; David Yarowsky
Abstract: We study subjective language in social media and create Twitter-specific lexicons via bootstrapping sentiment-bearing terms from multilingual Twitter streams. Starting with a domain-independent, high-precision sentiment lexicon and a large pool of unlabeled data, we bootstrap Twitter-specific sentiment lexicons, using a small amount of labeled data to guide the process. Our experiments on English, Spanish and Russian show that the resulting lexicons are effective for sentiment classification for many underexplored languages in social media.
3 0.69645733 214 acl-2013-Language Independent Connectivity Strength Features for Phrase Pivot Statistical Machine Translation
Author: Ahmed El Kholy ; Nizar Habash ; Gregor Leusch ; Evgeny Matusov ; Hassan Sawaf
Abstract: An important challenge to statistical machine translation (SMT) is the lack of parallel data for many language pairs. One common solution is to pivot through a third language for which there exist parallel corpora with the source and target languages. Although pivoting is a robust technique, it introduces some low-quality translations. In this paper, we present two language-independent features to improve the quality of phrase-pivot based SMT. The features, source connectivity strength and target connectivity strength, reflect the quality of projected alignments between the source and target phrases in the pivot phrase table. We show positive results (0.6 BLEU points) on Persian-Arabic SMT as a case study.
4 0.66406029 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference
Author: Kartik Goyal ; Sujay Kumar Jauhar ; Huiying Li ; Mrinmaya Sachan ; Shashank Srivastava ; Eduard Hovy
Abstract: In this paper we present a novel approach to modelling distributional semantics that represents meaning as distributions over relations in syntactic neighborhoods. We argue that our model approximates meaning in compositional configurations more effectively than standard distributional vectors or bag-of-words models. We test our hypothesis on the problem of judging event coreferentiality, which involves compositional interactions in the predicate-argument structure of sentences, and demonstrate that our model outperforms both state-of-the-art window-based word embeddings as well as simple approaches to compositional semantics previously employed in the literature.
5 0.66213208 228 acl-2013-Leveraging Domain-Independent Information in Semantic Parsing
Author: Dan Goldwasser ; Dan Roth
Abstract: Semantic parsing is a domain-dependent process by nature, as its output is defined over a set of domain symbols. Motivated by the observation that interpretation can be decomposed into domain-dependent and independent components, we suggest a novel interpretation model, which augments a domain-dependent model with abstract information that can be shared by multiple domains. Our experiments show that this type of information is useful and can reduce the annotation effort significantly when moving between domains.
6 0.65491748 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
7 0.65213436 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval
9 0.6483928 172 acl-2013-Graph-based Local Coherence Modeling
10 0.64431483 215 acl-2013-Large-scale Semantic Parsing via Schema Matching and Lexicon Extension
11 0.64334053 121 acl-2013-Discovering User Interactions in Ideological Discussions
12 0.6425662 238 acl-2013-Measuring semantic content in distributional vectors
13 0.64191031 126 acl-2013-Diverse Keyword Extraction from Conversations
14 0.64150256 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
15 0.64072144 4 acl-2013-A Context Free TAG Variant
16 0.64041322 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri
18 0.63987339 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations
19 0.63861215 58 acl-2013-Automated Collocation Suggestion for Japanese Second Language Learners
20 0.63788474 99 acl-2013-Crowd Prefers the Middle Path: A New IAA Metric for Crowdsourcing Reveals Turker Biases in Query Segmentation