acl acl2013 acl2013-222 knowledge-graph by maker-knowledge-mining

222 acl-2013-Learning Semantic Textual Similarity with Structural Representations


Source: pdf

Author: Aliaksei Severyn ; Massimo Nicosia ; Alessandro Moschitti

Abstract: Measuring semantic textual similarity (STS) is at the cornerstone of many NLP applications. Different from the majority of approaches, where a large number of pairwise similarity features are used to represent a text pair, our model features the following: (i) it directly encodes input texts into relational syntactic structures; (ii) relies on tree kernels to handle feature engineering automatically; (iii) combines both structural and feature vector representations in a single scoring model, i.e., in Support Vector Regression (SVR); and (iv) delivers significant improvement over the best STS systems.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 QCRI, Qatar Foundation, Doha, Qatar. [sent-5, score-0.057]

2 Abstract Measuring semantic textual similarity (STS) is at the cornerstone of many NLP applications. [sent-7, score-0.22]

3 1 Introduction In STS the goal is to learn a scoring model that, given a pair of short texts, returns a similarity score that correlates with human judgement. [sent-11, score-0.139]

4 Hence, the key aspect of having an accurate STS framework is the design of features that can adequately represent various aspects of the similarity between texts, e.g. [sent-12, score-0.162]

5 The majority of approaches treat input text pairs as feature vectors where each feature is a score corresponding to a certain type of similarity. [sent-15, score-0.28]

6 This approach is conceptually easy to implement and the STS shared task at SemEval 2012 (Agirre et al. [sent-16, score-0.051]

7 , a number of features encoding similarity of an input text pair were combined in a single scoring model, e.g. [sent-19, score-0.35]

8 Nevertheless, one limitation of using only similarity features to represent a text pair is that of low representation power. [sent-22, score-0.296]

9 The novelty of our approach is that we treat the input text pairs as structural objects and rely on the power of kernel learning to extract relevant structures. [sent-23, score-0.661]

10 To link the documents in a pair we mark the nodes in the related structures with a special relational tag. [sent-24, score-0.212]

11 This way, effective structural relational patterns are implicitly encoded in the trees and can be automatically learned by kernel-based machines. [sent-25, score-0.604]

12 We combine our relational structural model with the features from the two best systems of STS-2012. [sent-26, score-0.543]

13 Finally, we use the approach of classifier stacking to combine several structural models into the feature vector representation. [sent-27, score-0.493]

14 2 Structural Relational Similarity The approach of relating pairs of input structures by learning predictable syntactic transformations has been shown to deliver state-of-the-art results in question answering, recognizing textual entailment, and paraphrase detection, e.g. [sent-31, score-0.357]

15 applying quasi-synchronous grammar formalism and variations of tree edit distance alignments, to extract syntactic patterns relating pairs of input structures. [sent-37, score-0.294]

16 Our approach is conceptually simpler, as it regards the problem within the kernel learning framework. [sent-38, score-0.255]

17 We first encode salient syntactic/semantic properties of the input text pairs into tree structures and rely on tree kernels to automatically generate rich feature spaces. [sent-40, score-0.728]

18 , 2007; Moschitti and Quarteroni, 2008), in textual entailment recognition, e.g. [sent-44, score-0.103]

19 , (Moschitti and Zanzotto, 2007), and more generally in relational text categorization (Moschitti, 2008; Severyn and Moschitti, 2012). [sent-46, score-0.219]

20 In this section we describe: (i) a kernel framework to combine structural and vector models; (ii) structural kernels to handle feature engineering; and (iii) suitable structural representations for relational learning. [sent-47, score-1.688]

21 1 Structural Kernel Learning In supervised learning, given labeled data {(x_i, y_i)}_{i=1}^n, the goal is to estimate a decision function h(x) = y that maps input examples to their targets. [sent-49, score-0.25]

22 A conventional approach is to represent a pair of texts as a set of similarity features {f_i}, s.t. [sent-50, score-0.255]

23 the predictions are computed as h(x) = w · x = Σ_i w_i f_i, where w is the model weight vector. [sent-52, score-0.215]

24 Hence, the learning problem boils down to estimating the individual weights of each of the similarity features f_i. [sent-53, score-0.162]
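
A minimal sketch (mine, not the authors' code) of this feature-vector baseline: the score is a weighted sum of pairwise similarity features. The feature names, values, and weights below are invented for illustration.

```python
# Feature-vector baseline: h(x) = w . x = sum_i w_i * f_i over
# pairwise similarity features f_i. All names and numbers are made up.
def linear_score(features, weights):
    return sum(weights[name] * value for name, value in features.items())

pair_features = {"ngram_overlap": 0.62, "lsa_cosine": 0.80, "len_diff": 0.10}
weights = {"ngram_overlap": 1.3, "lsa_cosine": 2.1, "len_diff": -0.4}
print(linear_score(pair_features, weights))  # one real-valued STS score
```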

25 One downside of such approach is that a great deal of similarity information encoded in a given text pair is lost when modeled by single real-valued scores. [sent-54, score-0.259]

26 A more versatile approach in terms of the input representation relies on kernels. [sent-55, score-0.137]

27 , SVM, the prediction function for a test input x takes the following form: h(x) = Σ_i α_i y_i K(x, x_i), where α_i are the model parameters estimated from the training data, y_i are target variables, x_i are support vectors, and K(·, ·) is a kernel function. [sent-58, score-1.12]
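
The same prediction function as a small self-contained sketch; a plain dot product stands in for K(·, ·), and the support vectors, alphas, and targets are toy values rather than anything learned.

```python
# Kernelized prediction: h(x) = sum_i alpha_i * y_i * K(x, x_i).
def predict(x, support_vectors, alphas, targets, kernel):
    return sum(a * y * kernel(x, x_i)
               for a, y, x_i in zip(alphas, targets, support_vectors))

dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))  # toy kernel
print(predict([1.0, 0.5], [[0.9, 0.4], [0.2, 0.8]],
              [0.7, 0.3], [1.0, -1.0], dot))
```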

28 To encode both the structural representation and the similarity feature vector of a given text pair in a single model, we define each document in a pair to be composed of a tree and a vector: ⟨t, v⟩. [sent-59, score-0.97]

29 Each of the kernel computations K(x_i, x_j) can be broken down into the following: K(x^(1), x^(2)) = K_TK(t^(1), t^(2)) + K_fvec(v^(1), v^(2)), where K_TK computes a structural kernel and K_fvec is a kernel over feature vectors, e.g. [sent-61, score-1.782]
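
A sketch of this combined kernel, assuming each example is a ⟨tree, vector⟩ pair; the two stub kernels below are placeholders for the STK/PTK and polynomial kernels actually used.

```python
# K(x1, x2) = K_TK(t1, t2) + K_fvec(v1, v2) over <tree, vector> pairs.
def combined_kernel(x1, x2, tree_kernel, vec_kernel):
    (t1, v1), (t2, v2) = x1, x2
    return tree_kernel(t1, t2) + vec_kernel(v1, v2)

tree_stub = lambda t1, t2: 1.0 if t1 == t2 else 0.0   # placeholder K_TK
dot = lambda u, v: sum(a * b for a, b in zip(u, v))   # placeholder K_fvec
x1 = ("(NP (DT a) (NN pepper))", [0.62, 0.80])
x2 = ("(NP (DT a) (NN pepper))", [0.55, 0.90])
print(combined_kernel(x1, x2, tree_stub, dot))
```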

30 Further in the text we refer to structural tree kernel models as TK and to the explicit feature vector representation as fvec. [sent-64, score-0.816]

31 Having defined a way to jointly model text pairs using structural TK representations along with the similarity features fvec, we next briefly review tree kernels and our relational structures. [sent-65, score-1.149]

32 2 Tree Kernels We use tree structures as our base representation since they provide sufficient flexibility in representation and allow for easier feature extraction than, for example, graph structures. [sent-67, score-0.372]

33 Hence, we rely on tree kernels to compute K_TK(·, ·). [sent-68, score-0.328]

34 Given two trees, it evaluates the number of substructures they have in common. [sent-69, score-0.055]

35 Different TK functions are characterized by alternative fragment definitions. [sent-73, score-0.03]

36 In particular, we focus on the Syntactic Tree kernel (STK) (Collins and Duffy, 2002) and a Partial Tree Kernel (PTK) (Moschitti, 2006). [sent-74, score-0.163]

37 STK generates all possible substructures rooted in each node of the tree, with the constraint that production rules cannot be broken (i.e. [sent-75, score-0.201]

38 any node in a tree fragment must include either all or none of its children). [sent-77, score-0.16]

39 PTK can be more effectively applied to both constituency and dependency parse trees. [sent-78, score-0.246]

40 It generalizes STK as the fragments it generates can contain any subset of nodes, i.e. [sent-79, score-0.052]

41 PTK allows for breaking the production rules and generating an extremely rich feature space, which results in higher generalization ability. [sent-81, score-0.058]
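
To make the fragment counting concrete, here is a toy Python version of the STK recursion from Collins and Duffy (2002); production implementations such as SVM-Light-TK are far more elaborate, and the decay value below is arbitrary.

```python
# Toy Syntactic Tree Kernel: counts shared fragments that never break
# production rules. Trees are (label, children) tuples; lam is a decay factor.
def nodes(tree):
    out = [tree]
    for child in tree[1]:
        out.extend(nodes(child))
    return out

def production(node):                 # node label plus its children's labels
    return (node[0], tuple(child[0] for child in node[1]))

def is_preterminal(node):             # single child that is a leaf (a word)
    return len(node[1]) == 1 and not node[1][0][1]

def delta(n1, n2, lam):
    if production(n1) != production(n2):
        return 0.0
    if is_preterminal(n1):
        return lam
    score = lam
    for c1, c2 in zip(n1[1], n2[1]):  # productions match, so same arity
        score *= 1.0 + delta(c1, c2, lam)
    return score

def stk(t1, t2, lam=0.4):
    return sum(delta(n1, n2, lam)
               for n1 in nodes(t1) if n1[1]   # skip bare word nodes
               for n2 in nodes(t2) if n2[1])

t = ("NP", (("DT", (("a", ()),)), ("NN", (("pepper", ()),))))
print(stk(t, t))  # 1.584: the NP fragment plus the two preterminals
```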

42 3 Structural representations In this paper, we define simple-to-build relational structures based on: (i) a shallow syntactic tree, (ii) constituency, (iii) dependency and (iv) phrase-dependency trees. [sent-83, score-0.597]

43 Shallow tree is a two-level syntactic hierarchy built from word lemmas (leaves) and part-of-speech tags (preterminals), which are further organized into chunks. [sent-84, score-0.251]

44 It was shown to significantly outperform feature vector baselines for modeling relationships between question/answer pairs (Severyn and Moschitti, 2012). [sent-85, score-0.182]

45 While shallow syntactic parsing is very fast, here we consider using constituency structures as a potentially richer source of syntactic/semantic information. [sent-87, score-0.405]

46 We propose to use dependency relations between words to derive an alternative structural representation. [sent-89, score-0.392]

47 Figure 1: A phrase dependency-based structural representation of a text pair (s1, s2): A woman with a knife is slicing a pepper (s1) vs. [sent-90, score-0.576]

48 A woman slicing green pepper (s2), with a high semantic similarity (human judgement score 4. [sent-91, score-0.281]

49 Related tree fragments are linked with a REL tag. [sent-94, score-0.182]

50 This reordering of the nodes helps to avoid the situation where nodes with words tend to form long chains. [sent-96, score-0.102]

51 We also plug part-of-speech tags between the word nodes and nodes carrying their grammatical role. [sent-98, score-0.193]

52 We explore a phrase-dependency tree similar to the one defined in (Wu et al. [sent-100, score-0.208]

53 It represents an alternative structure derived from the dependency tree, where the dependency relations between words belonging to the same phrase (chunk) are collapsed into a unified node. [sent-102, score-0.187]

54 , 2009), the collapsed nodes are stored as a shallow subtree rooted at the unified node. [sent-104, score-0.215]

55 This node organization is particularly suitable for PTK, which effectively runs a sequence kernel on the tree fragments inside each chunk subtree. [sent-105, score-0.382]

56 Fig. 1 gives an example of our variation of a phrase dependency tree. [sent-106, score-0.078]

57 As a final consideration, if a document contains multiple sentences, they are merged into a single tree with a common root. [sent-107, score-0.13]

58 To encode the structural relationships between documents in a pair, a special REL tag is used to link the related structures. [sent-108, score-0.462]
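
One simple way to realize this linking, sketched below under the assumption that matching is done at the word level: any preterminal whose word appears in both texts gets a REL- prefix in both trees. The paper's exact matching procedure may differ.

```python
# Mark related nodes across a text pair with a REL tag.
# Trees are [label, [children]] lists so labels can be rewritten in place.
def nodes(tree):
    out = [tree]
    for child in tree[1]:
        out.extend(nodes(child))
    return out

def words(tree):
    return {n[0] for n in nodes(tree) if not n[1]}  # leaf labels

def mark_rel(tree1, tree2):
    shared = words(tree1) & words(tree2)
    for tree in (tree1, tree2):
        for node in nodes(tree):
            # a preterminal = one child that is a leaf; prefix it if shared
            if node[1] and not node[1][0][1] and node[1][0][0] in shared:
                node[0] = "REL-" + node[0]

s1 = ["NP", [["DT", [["a", []]]], ["NN", [["pepper", []]]]]]
s2 = ["NP", [["JJ", [["green", []]]], ["NN", [["pepper", []]]]]]
mark_rel(s1, s2)
print(s1)  # the NN over "pepper" is now REL-NN in both trees
```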

59 Along with the direct representation of input text pairs as structural objects, our framework is also capable of encoding pairwise similarity feature vectors (fvec), which we describe below. [sent-111, score-0.764]

60 (base) We adopt similarity features from the two best-performing systems of STS-2012, which were publicly released: namely, the Takelab system (Šarić et al. [sent-113, score-0.162]

61 Both systems represent input texts with similarity features combining multiple text similarity measures of varying complexity. [sent-116, score-0.395]

62 It also includes features derived from Explicit Semantic Analysis (Gabrilovich and Markovitch, 2007) and aggregation of word similarity based on lexical-semantic resources, e. [sent-118, score-0.162]

63 Takelab (T) includes n-gram matching of varying size, weighted word matching, length difference, WordNet similarity and vector space similarity where pairs of input sentences are mapped into Latent Semantic Analysis (LSA) space. [sent-122, score-0.379]

64 The features are computed over several sentence representations where stop words are removed and/or lemmas are used in place of raw tokens. [sent-123, score-0.174]
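
As a flavor of these fvec features, here is a hedged sketch of two of the feature types named above, n-gram matching and length difference; the formulas are simplified stand-ins, not the Takelab system's exact definitions.

```python
# Two toy pairwise similarity features over tokenized sentences.
def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_overlap(t1, t2, n=2):
    a, b = ngrams(t1, n), ngrams(t2, n)
    return len(a & b) / max(len(a | b), 1)        # Jaccard over n-grams

def length_difference(t1, t2):
    return abs(len(t1) - len(t2)) / max(len(t1), len(t2))

s1 = "a woman is slicing a pepper".split()
s2 = "a woman slicing green pepper".split()
print(ngram_overlap(s1, s2), length_difference(s1, s2))
```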

65 (Ciaramita and Altun, 2006), (iii) named entities, (iv) dependency triplets, and (v) PTK syntactic similarity scores computed between documents in a pair, where as input representations we use raw dependency and constituency trees. [sent-132, score-0.632]

66 To integrate multiple TK representations into a single model we apply a classifier stacking approach (Fast and Jensen, 2008). [sent-136, score-0.167]

67 Each of the learned TK models is used to generate predictions, which are then plugged as features into the final fvec representation, s.t. [sent-137, score-0.196]

68 the final model uses only the explicit feature vector representation. [sent-139, score-0.138]
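
A minimal sketch of this stacking step: each trained TK model contributes its real-valued prediction as one extra feature, so the final regressor sees a pure feature vector. The model objects and base extractor here are hypothetical stand-ins.

```python
# Stack TK-model predictions onto a base similarity-feature vector.
def stack_features(pair, tk_models, base_features):
    stacked = list(base_features(pair))               # e.g. UKP/Takelab fvec
    stacked += [model(pair) for model in tk_models]   # one score per TK model
    return stacked

tk_models = [lambda p: 0.71, lambda p: 0.64]          # hypothetical TK models
base_features = lambda p: [0.62, 0.80, 0.10]          # hypothetical fvec
print(stack_features(("s1 text", "s2 text"), tk_models, base_features))
```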

69 We use the entire training data to obtain a single model for making predictions on each test set. [sent-147, score-0.03]

70 To encode TK models along with the similarity feature vectors into a single regression scoring model, we use an SVR framework implemented in SVM-Light-TK. [sent-149, score-0.36]

71 We use the following parameter settings: -t 5 -F 1 -W A -C +, which specifies a combination of trees and feature vectors (-C +), STK over trees (-F 1; -F 3 for PTK) computed in all-vs-all mode (-W A), and a polynomial kernel of degree 3 for the feature vector (active by default). [sent-150, score-0.516]

72 2 Results Table 1 summarizes the results of combining TK models with a strong feature vector model. [sent-157, score-0.103]

73 5 Conclusions and Future Work We have presented an approach where text pairs are directly treated as structural objects. [sent-162, score-0.377]

74 This provides a much richer representation for the learning algorithm to extract useful syntactic and shallow semantic patterns. [sent-163, score-0.211]

75 We have provided an extensive experimental study of four different structural representations, i.e. [sent-164, score-0.314]

76 shallow, constituency, dependency and phrase-dependency trees using STK and PTK. [sent-166, score-0.133]

77 via stacking; (iii) to our knowledge, this work is the first to apply structural kernels and their combinations in a regression setting; and (iv) our model achieves the state of the art in STS, largely improving over the best previous systems. [sent-169, score-0.553]

78 Our structural learning approach to STS is conceptually simple and does not require additional linguistic resources other than off-the-shelf syntactic parsers. [sent-170, score-0.407]

79 It is particularly suitable for NLP tasks where the input domain comes (footnote: we also report the results for a concatenation of all five test sets (ALL)). Table 1: Results on STS-2012. [sent-171, score-0.051]

80 First set of experiments studies the combination of fvec models from UKP (U), Takelab (T) and (A). [sent-172, score-0.128]

81 Next we show results for four structural representations: shallow (S), constituency (C), dependency (D) and phrase-dependency (P) trees with STK and PTK; the next row set demonstrates the necessity of relational linking for the two best structures, i.e. [sent-173, score-0.903]

82 C and D (an empty circle denotes structures with no relational linking); [sent-175, score-0.289]

83 finally, domain adaptation via bags of features (B) of the entire pair and (M) manually encoded dataset type shows state-of-the-art results. [sent-176, score-0.211]

84 Semeval-2012 task 6: A pilot on semantic textual similarity. [sent-183, score-0.096]

85 Ukp: Computing semantic textual similarity by combining multiple content similarity measures. [sent-187, score-0.344]

86 Broad-coverage sense disambiguation and information extraction with a supersense sequence tagger. [sent-191, score-0.049]

87 Tree edit models for recognizing textual entailments, paraphrases, and answers to questions. [sent-209, score-0.095]

88 Fast and effective kernels for relational learning from texts. [sent-217, score-0.389]

89 Exploiting syntactic and shallow semantic kernels for question/answer classification. [sent-221, score-0.366]

90 Efficient convolution kernels for dependency and constituent syntactic trees. [sent-225, score-0.318]

91 Kernel methods, syntax and semantics for relational text categorization. [sent-229, score-0.219]

92 Probabilistic tree-edit models with structured latent variables for textual entailment and question answering. [sent-242, score-0.103]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('ptk', 0.32), ('structural', 0.314), ('xx', 0.302), ('sts', 0.235), ('stk', 0.226), ('kernels', 0.198), ('relational', 0.191), ('moschitti', 0.175), ('constituency', 0.168), ('kernel', 0.163), ('tk', 0.132), ('tree', 0.13), ('fvec', 0.128), ('similarity', 0.124), ('takelab', 0.113), ('severyn', 0.113), ('xxx', 0.113), ('structures', 0.098), ('shallow', 0.097), ('ktk', 0.096), ('alessandro', 0.094), ('representations', 0.091), ('ukp', 0.089), ('iv', 0.088), ('dependency', 0.078), ('stacking', 0.076), ('xj', 0.072), ('iii', 0.071), ('textual', 0.067), ('kfvec', 0.064), ('pepper', 0.064), ('slicing', 0.064), ('pair', 0.063), ('ii', 0.062), ('rel', 0.061), ('feature', 0.058), ('chitt', 0.057), ('aliaksei', 0.057), ('plug', 0.057), ('vv', 0.056), ('trees', 0.055), ('fragments', 0.052), ('quarteroni', 0.052), ('xxi', 0.052), ('input', 0.051), ('conceptually', 0.051), ('nodes', 0.051), ('vectors', 0.05), ('supersense', 0.049), ('triplets', 0.047), ('scoring', 0.046), ('lemmas', 0.045), ('vector', 0.045), ('svr', 0.045), ('encoded', 0.044), ('relationships', 0.044), ('bar', 0.043), ('yy', 0.043), ('versatile', 0.043), ('representation', 0.043), ('syntactic', 0.042), ('gabrilovich', 0.042), ('encode', 0.041), ('regression', 0.041), ('mengqiu', 0.04), ('qatar', 0.04), ('novelty', 0.039), ('ciaramita', 0.039), ('heilman', 0.038), ('bags', 0.038), ('features', 0.038), ('chunk', 0.037), ('ww', 0.036), ('entailment', 0.036), ('rooted', 0.036), ('relating', 0.036), ('silvia', 0.036), ('ari', 0.036), ('pairs', 0.035), ('broken', 0.035), ('explicit', 0.035), ('tags', 0.034), ('agirre', 0.033), ('massimo', 0.033), ('semeval', 0.032), ('polynomial', 0.032), ('objects', 0.031), ('collapsed', 0.031), ('tt', 0.031), ('pairwise', 0.03), ('fragment', 0.03), ('predictions', 0.03), ('texts', 0.03), ('semantic', 0.029), ('fast', 0.029), ('answering', 0.029), ('text', 0.028), ('adaptation', 0.028), ('wu', 0.028), ('recognizing', 0.028)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000008 222 acl-2013-Learning Semantic Textual Similarity with Structural Representations

Author: Aliaksei Severyn ; Massimo Nicosia ; Alessandro Moschitti

Abstract: Measuring semantic textual similarity (STS) is at the cornerstone of many NLP applications. Different from the majority of approaches, where a large number of pairwise similarity features are used to represent a text pair, our model features the following: (i) it directly encodes input texts into relational syntactic structures; (ii) relies on tree kernels to handle feature engineering automatically; (iii) combines both structural and feature vector representations in a single scoring model, i.e., in Support Vector Regression (SVR); and (iv) delivers significant improvement over the best STS systems.

2 0.24716631 144 acl-2013-Explicit and Implicit Syntactic Features for Text Classification

Author: Matt Post ; Shane Bergsma

Abstract: Syntactic features are useful for many text classification tasks. Among these, tree kernels (Collins and Duffy, 2001) have been perhaps the most robust and effective syntactic tool, appealing for their empirical success, but also because they do not require an answer to the difficult question of which tree features to use for a given task. We compare tree kernels to different explicit sets of tree features on five diverse tasks, and find that explicit features often perform as well as tree kernels on accuracy and always in orders of magnitude less time, and with smaller models. Since explicit features are easy to generate and use (with publicly available tools), we suggest they should always be included as baseline comparisons in tree kernel method evaluations.

3 0.23844834 134 acl-2013-Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction

Author: Barbara Plank ; Alessandro Moschitti

Abstract: Relation Extraction (RE) is the task of extracting semantic relationships between entities in text. Recent studies on relation extraction are mostly supervised. The clear drawback of supervised methods is the need of training data: labeled data is expensive to obtain, and there is often a mismatch between the training data and the data the system will be applied to. This is the problem of domain adaptation. In this paper, we propose to combine (i) term generalization approaches such as word clustering and latent semantic analysis (LSA) and (ii) structured kernels to improve the adaptability of relation extractors to new text genres/domains. The empirical evaluation on ACE 2005 domains shows that a suitable combination of syntax and lexical generalization is very promising for domain adaptation.

4 0.14773178 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity

Author: Mohammad Taher Pilehvar ; David Jurgens ; Roberto Navigli

Abstract: Semantic similarity is an essential component of many Natural Language Processing applications. However, prior methods for computing semantic similarity often operate at different levels, e.g., single words or entire documents, which requires adapting the method for each data type. We present a unified approach to semantic similarity that operates at multiple levels, all the way from comparing word senses to comparing text documents. Our method leverages a common probabilistic representation over word senses in order to compare different types of linguistic data. This unified representation shows state-of-the-art performance on three tasks: semantic textual similarity, word similarity, and word sense coarsening.

5 0.12629102 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models

Author: Wen-tau Yih ; Ming-Wei Chang ; Christopher Meek ; Andrzej Pastusiak

Abstract: In this paper, we study the answer sentence selection problem for question answering. Unlike previous work, which primarily leverages syntactic analysis through dependency tree matching, we focus on improving the performance using models of lexical semantic resources. Experiments show that our systems can be consistently and significantly improved with rich lexical semantic information, regardless of the choice of learning algorithms. When evaluated on a benchmark dataset, the MAP and MRR scores are increased by 8 to 10 points, compared to one of our baseline systems using only surface-form matching. Moreover, our best system also outperforms previous work that makes use of the dependency tree structure by a wide margin.

6 0.11533061 104 acl-2013-DKPro Similarity: An Open Source Framework for Text Similarity

7 0.1101748 248 acl-2013-Modelling Annotator Bias with Multi-task Gaussian Processes: An Application to Machine Translation Quality Estimation

8 0.095273353 296 acl-2013-Recognizing Identical Events with Graph Kernels

9 0.091732778 304 acl-2013-SEMILAR: The Semantic Similarity Toolkit

10 0.084184892 221 acl-2013-Learning Non-linear Features for Machine Translation Using Gradient Boosting Machines

11 0.083908908 28 acl-2013-A Unified Morpho-Syntactic Scheme of Stanford Dependencies

12 0.08046525 80 acl-2013-Chinese Parsing Exploiting Characters

13 0.067259118 310 acl-2013-Semantic Frames to Predict Stock Price Movement

14 0.066682376 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning

15 0.066232547 297 acl-2013-Recognizing Partial Textual Entailment

16 0.064299226 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation

17 0.063657239 275 acl-2013-Parsing with Compositional Vector Grammars

18 0.063040562 9 acl-2013-A Lightweight and High Performance Monolingual Word Aligner

19 0.062068179 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation

20 0.061602019 328 acl-2013-Stacking for Statistical Machine Translation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.207), (1, -0.003), (2, -0.044), (3, -0.088), (4, -0.067), (5, 0.009), (6, -0.013), (7, -0.037), (8, 0.022), (9, 0.002), (10, 0.027), (11, 0.055), (12, 0.017), (13, -0.03), (14, 0.019), (15, 0.058), (16, 0.001), (17, 0.091), (18, -0.057), (19, 0.062), (20, 0.122), (21, 0.072), (22, 0.072), (23, -0.008), (24, -0.178), (25, 0.04), (26, 0.04), (27, -0.031), (28, -0.082), (29, -0.036), (30, -0.11), (31, 0.168), (32, 0.051), (33, -0.022), (34, -0.112), (35, 0.062), (36, 0.014), (37, -0.087), (38, 0.02), (39, 0.118), (40, -0.048), (41, 0.146), (42, 0.025), (43, 0.022), (44, -0.018), (45, -0.112), (46, -0.022), (47, -0.08), (48, -0.048), (49, 0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94591677 222 acl-2013-Learning Semantic Textual Similarity with Structural Representations

Author: Aliaksei Severyn ; Massimo Nicosia ; Alessandro Moschitti

Abstract: Measuring semantic textual similarity (STS) is at the cornerstone of many NLP applications. Different from the majority of approaches, where a large number of pairwise similarity features are used to represent a text pair, our model features the following: (i) it directly encodes input texts into relational syntactic structures; (ii) relies on tree kernels to handle feature engineering automatically; (iii) combines both structural and feature vector representations in a single scoring model, i.e., in Support Vector Regression (SVR); and (iv) delivers significant improvement over the best STS systems.

2 0.82840538 134 acl-2013-Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction

Author: Barbara Plank ; Alessandro Moschitti

Abstract: Relation Extraction (RE) is the task of extracting semantic relationships between entities in text. Recent studies on relation extraction are mostly supervised. The clear drawback of supervised methods is the need of training data: labeled data is expensive to obtain, and there is often a mismatch between the training data and the data the system will be applied to. This is the problem of domain adaptation. In this paper, we propose to combine (i) term generalization approaches such as word clustering and latent semantic analysis (LSA) and (ii) structured kernels to improve the adaptability of relation extractors to new text genres/domains. The empirical evaluation on ACE 2005 domains shows that a suitable combination of syntax and lexical generalization is very promising for domain adaptation.

3 0.80489272 144 acl-2013-Explicit and Implicit Syntactic Features for Text Classification

Author: Matt Post ; Shane Bergsma

Abstract: Syntactic features are useful for many text classification tasks. Among these, tree kernels (Collins and Duffy, 2001) have been perhaps the most robust and effective syntactic tool, appealing for their empirical success, but also because they do not require an answer to the difficult question of which tree features to use for a given task. We compare tree kernels to different explicit sets of tree features on five diverse tasks, and find that explicit features often perform as well as tree kernels on accuracy and always in orders of magnitude less time, and with smaller models. Since explicit features are easy to generate and use (with publicly available tools), we suggest they should always be included as baseline comparisons in tree kernel method evaluations.

4 0.58791381 304 acl-2013-SEMILAR: The Semantic Similarity Toolkit

Author: Vasile Rus ; Mihai Lintean ; Rajendra Banjade ; Nobal Niraula ; Dan Stefanescu

Abstract: We present in this paper SEMILAR, the SEMantic simILARity toolkit. SEMILAR implements a number of algorithms for assessing the semantic similarity between two texts. It is available as a Java library and as a Java standalone application offering GUI-based access to the implemented semantic similarity methods. Furthermore, it offers facilities for manual semantic similarity annotation by experts through its component SEMILAT (a SEMantic simILarity Annotation Tool).

5 0.56694943 104 acl-2013-DKPro Similarity: An Open Source Framework for Text Similarity

Author: Daniel Bar ; Torsten Zesch ; Iryna Gurevych

Abstract: We present DKPro Similarity, an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. DKPro Similarity comprises a wide variety of measures ranging from ones based on simple n-grams and common subsequences to high-dimensional vector comparisons and structural, stylistic, and phonetic measures. In order to promote the reproducibility of experimental results and to provide reliable, permanent experimental conditions for future studies, DKPro Similarity additionally comes with a set of full-featured experimental setups which can be run out-of-the-box and be used for future systems to build upon.

6 0.49232113 262 acl-2013-Offspring from Reproduction Problems: What Replication Failure Teaches Us

7 0.49139509 12 acl-2013-A New Set of Norms for Semantic Relatedness Measures

8 0.48858356 310 acl-2013-Semantic Frames to Predict Stock Price Movement

9 0.47342142 163 acl-2013-From Natural Language Specifications to Program Input Parsers

10 0.46970174 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models

11 0.4659566 112 acl-2013-Dependency Parser Adaptation with Subtrees from Auto-Parsed Target Domain Data

12 0.46438134 248 acl-2013-Modelling Annotator Bias with Multi-task Gaussian Processes: An Application to Machine Translation Quality Estimation

13 0.46354279 346 acl-2013-The Impact of Topic Bias on Quality Flaw Prediction in Wikipedia

14 0.46260509 242 acl-2013-Mining Equivalent Relations from Linked Data

15 0.45463225 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity

16 0.44503745 176 acl-2013-Grounded Unsupervised Semantic Parsing

17 0.44382706 96 acl-2013-Creating Similarity: Lateral Thinking for Vertical Similarity Judgments

18 0.43410584 165 acl-2013-General binarization for parsing and translation

19 0.43323475 383 acl-2013-Vector Space Model for Adaptation in Statistical Machine Translation

20 0.43014136 215 acl-2013-Large-scale Semantic Parsing via Schema Matching and Lexicon Extension


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.083), (6, 0.041), (11, 0.057), (19, 0.209), (24, 0.04), (26, 0.08), (28, 0.013), (35, 0.088), (42, 0.076), (48, 0.036), (64, 0.015), (70, 0.077), (88, 0.016), (90, 0.013), (95, 0.071)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.88091218 340 acl-2013-Text-Driven Toponym Resolution using Indirect Supervision

Author: Michael Speriosu ; Jason Baldridge

Abstract: Toponym resolvers identify the specific locations referred to by ambiguous placenames in text. Most resolvers are based on heuristics using spatial relationships between multiple toponyms in a document, or metadata such as population. This paper shows that text-driven disambiguation for toponyms is far more effective. We exploit document-level geotags to indirectly generate training instances for text classifiers for toponym resolution, and show that textual cues can be straightforwardly integrated with other commonly used ones. Results are given for both 19th century texts pertaining to the American Civil War and 20th century newswire articles.

same-paper 2 0.84725046 222 acl-2013-Learning Semantic Textual Similarity with Structural Representations

Author: Aliaksei Severyn ; Massimo Nicosia ; Alessandro Moschitti

Abstract: Measuring semantic textual similarity (STS) is at the cornerstone of many NLP applications. Different from the majority of approaches, where a large number of pairwise similarity features are used to represent a text pair, our model features the following: (i) it directly encodes input texts into relational syntactic structures; (ii) relies on tree kernels to handle feature engineering automatically; (iii) combines both structural and feature vector representations in a single scoring model, i.e., in Support Vector Regression (SVR); and (iv) delivers significant improvement over the best STS systems.

3 0.79207623 126 acl-2013-Diverse Keyword Extraction from Conversations

Author: Maryam Habibi ; Andrei Popescu-Belis

Abstract: A new method for keyword extraction from conversations is introduced, which preserves the diversity of topics that are mentioned. Inspired from summarization, the method maximizes the coverage of topics that are recognized automatically in transcripts of conversation fragments. The method is evaluated on excerpts of the Fisher and AMI corpora, using a crowdsourcing platform to elicit comparative relevance judgments. The results demonstrate that the method outperforms two competitive baselines.

4 0.7859025 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval

Author: Xuchen Yao ; Benjamin Van Durme ; Peter Clark

Abstract: Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily performance gain for QA. We propose to tightly integrate them by coupling automatically learned features for answer extraction to a shallow-structured IR model. Our method is very quick to implement, and significantly improves IR for QA (measured in Mean Average Precision and Mean Reciprocal Rank) by 10%-20% against an uncoupled retrieval baseline in both document and passage retrieval, which further leads to a downstream 20% improvement in QA F1.

5 0.76173019 4 acl-2013-A Context Free TAG Variant

Author: Ben Swanson ; Elif Yamangil ; Eugene Charniak ; Stuart Shieber

Abstract: We propose a new variant of Tree-Adjoining Grammar that allows adjunction of full wrapping trees but still bears only context-free expressivity. We provide a transformation to context-free form, and a further reduction in probabilistic model size through factorization and pooling of parameters. This collapsed context-free form is used to implement efficient grammar estimation and parsing algorithms. We perform parsing experiments on the Penn Treebank and draw comparisons to Tree-Substitution Grammars and between different variations in probabilistic model design. Examination of the most probable derivations reveals examples of the linguistically relevant structure that our variant makes possible.

6 0.75957483 317 acl-2013-Sentence Level Dialect Identification in Arabic

7 0.69689775 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation

8 0.68818688 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing

9 0.68541414 369 acl-2013-Unsupervised Consonant-Vowel Prediction over Hundreds of Languages

10 0.68250918 276 acl-2013-Part-of-Speech Induction in Dependency Trees for Statistical Machine Translation

11 0.68178815 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search

12 0.68084419 134 acl-2013-Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction

13 0.6796298 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation

14 0.67812389 275 acl-2013-Parsing with Compositional Vector Grammars

15 0.67795789 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction

16 0.67755103 80 acl-2013-Chinese Parsing Exploiting Characters

17 0.67710197 164 acl-2013-FudanNLP: A Toolkit for Chinese Natural Language Processing

18 0.67477703 212 acl-2013-Language-Independent Discriminative Parsing of Temporal Expressions

19 0.67315632 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction

20 0.6722576 82 acl-2013-Co-regularizing character-based and word-based models for semi-supervised Chinese word segmentation