acl acl2012 acl2012-184 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Fan Bu ; Hang Li ; Xiaoyan Zhu
Abstract: Learning for sentence re-writing is a fundamental task in natural language processing and information retrieval. In this paper, we propose a new class of kernel functions, referred to as string re-writing kernel, to address the problem. A string re-writing kernel measures the similarity between two pairs of strings, each pair representing re-writing of a string. It can capture the lexical and structural similarity between two pairs of sentences without the need of constructing syntactic trees. We further propose an instance of string rewriting kernel which can be computed efficiently. Experimental results on benchmark datasets show that our method can achieve better results than state-of-the-art methods on two sentence re-writing learning tasks: paraphrase identification and recognizing textual entailment.
Reference: text
sentIndex sentText sentNum sentScore
1 In this paper, we propose a new class of kernel functions, referred to as string re-writing kernel, to address the problem. [sent-11, score-0.572]
2 A string re-writing kernel measures the similarity between two pairs of strings, each pair representing re-writing of a string. [sent-12, score-0.653]
3 We further propose an instance of string rewriting kernel which can be computed efficiently. [sent-14, score-0.64]
4 Experimental results on benchmark datasets show that our method can achieve better results than state-of-the-art methods on two sentence re-writing learning tasks: paraphrase identification and recognizing textual entailment. [sent-15, score-0.439]
5 1 Introduction Learning for sentence re-writing is a fundamental task in natural language processing and information retrieval, which includes paraphrasing, textual entailment and transformation between query and document title in search. [sent-16, score-0.23]
6 In previous research on sentence re-writing learning such as paraphrase identification and recognizing textual entailment, most representations are based on lexicons (Zhang and Patrick, 2005; Lintean and Rus, 2011; de Marneffe et al.). [sent-18, score-0.439]
7 In (Lin and Pantel, 2001; Barzilay and Lee, 2003), re-writing rules serve as underlying representations for paraphrase generation/discovery. [sent-26, score-0.233]
8 Specifically, we propose a new class of kernel functions (Schölkopf and Smola, 2002), called string re-writing kernel (SRK), which defines the similarity between two re-writings (pairs) of strings as the inner product between them in the feature space induced by all the re-writing rules. [sent-30, score-1.109]
9 SRK is different from existing kernels in that it is for re-writing and defined on two pairs of strings. [sent-31, score-0.231]
10 We are able to develop an instance of SRK, referred to as kb-SRK, which directly computes the number of common rewriting rules without explicitly enumerating them. [sent-36, score-0.181]
11 Experimental results on benchmark datasets show that SRK achieves better results than the state-ofthe-art methods in paraphrase identification and recognizing textual entailment. [sent-39, score-0.41]
12 2 Related Work The string kernel function was first proposed by Lodhi et al. (2002) for text classification. [sent-42, score-0.538]
13 Leslie et al. (2002) proposed the k-spectrum kernel which represents strings by their contiguous substrings of length k. [sent-45, score-0.437]
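As a concrete illustration, here is a minimal Python sketch of a k-spectrum computation (the function name k_spectrum and the character-level toy example are ours, not the paper's; in the sentence re-writing setting the "characters" are words):

```python
from collections import Counter

def k_spectrum(x, y, k):
    """k-spectrum kernel: inner product of the counts of contiguous
    length-k substrings of x and y (Leslie et al., 2002)."""
    cx = Counter(x[i:i + k] for i in range(len(x) - k + 1))
    cy = Counter(y[j:j + k] for j in range(len(y) - k + 1))
    return sum(cx[g] * cy[g] for g in cx)

# "abc" occurs twice in the first string and once in the second.
print(k_spectrum("abcabc", "abcd", 3))  # -> 2
```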
14 Leslie and Kuang (2004) further proposed a number of string kernels including the wildcard kernel to facilitate inexact matching between the strings. [sent-47, score-0.969]
15 String kernels defined on two pairs of objects (including strings) were also developed; they decompose the similarity into the product of similarities between individual objects using the tensor product (Basilico and Hofmann, 2004; Ben-Hur and Noble, 2005) or the Cartesian product (Kashima et al.). [sent-48, score-0.522]
16 The task of paraphrasing usually consists of paraphrase pattern generation and paraphrase identification. [sent-50, score-0.43]
17 Paraphrase identification is to identify whether two given sentences are a paraphrase of each other. [sent-52, score-0.241]
18 The methods proposed so far formalized the problem as classification and used various types of features such as bag-of-words features, edit distance (Zhang and Patrick, 2005), dissimilarity kernels (Lintean and Rus, 2011), and predicate-argument structure (Qiu et al.). [sent-53, score-0.388]
19 The task of recognizing textual entailment is to decide whether the hypothesis sentence can be entailed by the premise sentence (Giampiccolo et al.). [sent-58, score-0.341]
20 Moschitti and Zanzotto (2007) used a kernel method on syntactic tree pairs. [sent-66, score-0.401]
21 Following the literature on string kernels, we use the terms “string” and “character” instead of “sentence” and “word”. [sent-68, score-0.195]
The training data is given as ((s1,t1),y1), . . . , ((sn,tn),yn) ∈ (Σ∗ × Σ∗) × Y, where Σ denotes the character set, Σ∗ = ∪_{i=0}^{∞} Σ^i denotes the string set, which is the Kleene closure of set Σ, Y denotes the set of responses, and n is the number of instances. [sent-72, score-0.318]
23 (si, ti) is a re-writing consisting of the source string si and the target string ti. [sent-73, score-0.43]
Given a new string re-writing (s, t) ∈ Σ∗ × Σ∗, our goal is to predict its response y. [sent-78, score-0.195]
That is, the training data consists of binary classes of string re-writings, and the prediction is made for the new re-writing based on learning from the training data. [sent-79, score-0.195]
26 We take the kernel approach to address the learning task. [sent-80, score-0.343]
The kernel on re-writings of strings is defined as K : (Σ∗ × Σ∗) × (Σ∗ × Σ∗) → R satisfying, for all (si, ti), (sj, tj) ∈ Σ∗ × Σ∗, K((si, ti), (sj, tj)) = ⟨Φ(si, ti), Φ(sj, tj)⟩, where Φ maps each re-writing (pair) of strings into a high dimensional Hilbert space H, referred to as the feature space. [sent-81, score-0.611]
By the representer theorem (Kimeldorf and Wahba, 1971; Schölkopf and Smola, 2002), it can be shown that the response y of a new string re-writing (s, t) can always be represented as y = sign(∑_{i=1}^{n} α_i y_i K((s_i, t_i), (s, t))), where α_i ≥ 0 (i = 1, · · · , n) are parameters. [sent-82, score-0.236]
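A minimal sketch of evaluating this decision function, assuming the weights α_i come from some trainer such as an SVM solver (the function name predict and the data layout are ours):

```python
def predict(rewriting, train_data, alphas, K):
    """Evaluate y = sign(sum_i alpha_i * y_i * K((s_i, t_i), (s, t))).

    train_data: list of ((s_i, t_i), y_i) pairs with y_i in {-1, +1};
    alphas:     non-negative weights obtained by training (not shown);
    K:          a string re-writing kernel over two (source, target) pairs.
    """
    score = sum(a * y * K(pair, rewriting)
                for a, (pair, y) in zip(alphas, train_data))
    return 1 if score >= 0 else -1
```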
29 The question then becomes how to design the kernel function for the task. [sent-87, score-0.343]
Let the wildcard domain D ⊆ Σ∗ be a set of strings. [sent-89, score-0.301]
The string re-writing kernel measures the similarity between two string re-writings through the re-writing rules that can be applied to them. [sent-91, score-0.88]
A re-writing rule is defined as a triple r = (βs, βt, τ), where βs, βt ∈ (Σ ∪ {∗})∗ denote the source and target string patterns, and τ ⊆ ind∗(βs) × ind∗(βt) denotes the alignments between the wildcards in the two string patterns. [sent-93, score-0.516]
Here ind∗(β) denotes the set of indexes of wildcards in β. [sent-94, score-0.176]
βt = (∗, was, written, by, ∗). It matches with the string pair in Fig. [sent-99, score-0.252]
The string re-writing kernel is a class of kernels which depends on the re-writing rule set R and the wildcard domain D. [sent-101, score-0.82]
We define the pairwise k-spectrum kernel (ps-SRK) Kkps as the re-writing rule kernel under R = {(βs, βt, τ) | βs, βt ∈ Σk, τ = ∅} and any D. [sent-105, score-0.822]
It can be shown that Kkps((s1,t1), (s2,t2)) = Kkspec(s1,s2) · Kkspec(t1,t2), where Kkspec(x,y) is equivalent to the k-spectrum kernel proposed by Leslie et al. [sent-106, score-0.343]
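A sketch of this factorization over word sequences (the helper names and the toy sentences are ours):

```python
from collections import Counter

def k_spectrum(x, y, k):
    # Inner product of counts of contiguous length-k word n-grams.
    cx = Counter(tuple(x[i:i + k]) for i in range(len(x) - k + 1))
    cy = Counter(tuple(y[j:j + k]) for j in range(len(y) - k + 1))
    return sum(cx[g] * cy[g] for g in cx)

def ps_srk(pair1, pair2, k):
    # ps-SRK factorizes into a spectrum kernel on the source strings
    # times a spectrum kernel on the target strings.
    (s1, t1), (s2, t2) = pair1, pair2
    return k_spectrum(s1, s2, k) * k_spectrum(t1, t2, k)

p1 = ("he wrote the book".split(), "the book was written by him".split())
p2 = ("she wrote the letter".split(), "the letter was written by her".split())
print(ps_srk(p1, p2, 2))  # 1 shared source bigram x 2 shared target bigrams = 2
```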
The pairwise k-wildcard kernel (pw-SRK) Kkpw is defined as the re-writing rule kernel under R = {(βs, βt, τ) | βs, βt ∈ (Σ ∪ {∗})k, τ = ∅} and D = Σ. [sent-109, score-0.822]
Both kernels shown above are represented as the product of two kernels defined separately on strings s1, s2 and t1, t2; that is, they do not consider the alignment relations between the strings. [sent-112, score-0.503]
5 K-gram Bijective String Re-writing Kernel Next we propose another instance of the string re-writing kernel, called the k-gram bijective string re-writing kernel (kb-SRK). [sent-113, score-1.088]
41 As will be seen, kb-SRK can be computed efficiently, although it is defined on two pairs of strings and is not decomposed (note that ps-SRK and pw-SRK are decomposed). [sent-114, score-0.152]
1 Definition The kb-SRK has the following properties: (1) A wildcard can only substitute a single character, denoted as “?”. [sent-116, score-0.207]
43 (2) The two string patterns in a rewriting rule are of length k. [sent-118, score-0.394]
(3) The alignment between the wildcards is bijective, i.e., there is a one-to-one mapping between the wildcards in the string patterns. [sent-121, score-0.33]
Formally, the k-gram bijective string re-writing kernel Kk is defined as a string re-writing kernel under the re-writing rule set R = {(βs, βt, τ) | βs, βt ∈ (Σ ∪ {?})k, τ is a bijective alignment} and wildcard domain D = Σ. [sent-122, score-1.324]
Since each re-writing rule contains two string patterns of length k and each wildcard can only substitute one character, a re-writing rule can only match k-gram pairs in (s, t). [sent-124, score-0.654]
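A sketch of enumerating the candidate k-gram pairs of a re-writing (s, t) that rules can match (the helper k_gram_pairs is ours; words play the role of characters):

```python
def k_gram_pairs(s, t, k):
    """All (k-gram of s, k-gram of t) pairs of a re-writing (s, t)."""
    s_grams = [tuple(s[i:i + k]) for i in range(len(s) - k + 1)]
    t_grams = [tuple(t[j:j + k]) for j in range(len(t) - k + 1)]
    return [(a, b) for a in s_grams for b in t_grams]

s = "he wrote the book".split()
t = "the book was written by him".split()
print(len(k_gram_pairs(s, t, 3)))  # 2 k-grams in s x 4 in t -> 8 pairs
```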
Direct computation of Eq. (5) needs to perform matching of all the re-writing rules with the two k-gram pairs (αs1, αt1), (αs2, αt2), which has time complexity O(k!). [sent-132, score-0.256]
1 Transformation of Problem For ease of manipulation, our method transforms the computation of the kernel on k-grams into the computation on a new data structure called lists of doubles. [sent-138, score-0.563]
Thus Σ × Σ denotes the set of doubles, and αsD, αtD denote lists of doubles in (Σ × Σ)∗. [sent-142, score-0.238]
Figure 3: Example of the pair of double lists combined from the two k-gram pairs in Fig. 2. [sent-161, score-0.481]
Fig. 3 shows example lists of doubles combined from k-grams. [sent-169, score-0.319]
We introduce the set of identical doubles I = {(c, c) | c ∈ Σ} and the set of non-identical doubles N = {(c, c′) | c, c′ ∈ Σ and c ≠ c′}. [sent-170, score-0.394]
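A sketch of the ⊗ combination and of the split into identical and non-identical doubles (combine and split_doubles are our own helpers):

```python
def combine(alpha_1, alpha_2):
    """Combine two k-grams of equal length k into a list of doubles,
    i.e. the operation written alpha_1 ⊗ alpha_2 in the text."""
    assert len(alpha_1) == len(alpha_2)
    return list(zip(alpha_1, alpha_2))

def split_doubles(double_list):
    """Partition a double list into identical doubles (from I)
    and non-identical doubles (from N)."""
    identical = [d for d in double_list if d[0] == d[1]]
    non_identical = [d for d in double_list if d[0] != d[1]]
    return identical, non_identical

dl = combine(("the", "book", "was"), ("the", "novel", "was"))
print(split_doubles(dl))
# -> ([('the', 'the'), ('was', 'was')], [('book', 'novel')])
```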
We define the set of re-writing rules for double lists RD = {rD = (βsD, βtD, τ) | βsD, βtD ∈ (I ∪ {?})k, τ is a bijective alignment}, where βsD and βtD are lists of identical doubles including wildcards and with length k. [sent-173, score-0.151]
We say rule rD matches a pair of double lists (αsD, αtD) iff: [sent-174, score-0.597]
βsD, βtD can be changed into αsD and αtD by substituting each wildcard pair with a double in Σ × Σ, and the double substituting the wildcard pair βsD[i] and βtD[j] must be an identical double when there is an alignment (i, j) ∈ τ. [sent-175, score-0.879]
Fig. 4 (B) shows an example of a re-writing rule for double lists. [sent-179, score-0.341]
2 Computing K̄k We consider how to compute K̄k by extending the computation from k-grams to double lists. [sent-184, score-0.322]
The following lemma shows that computing the weighted sum of re-writing rules matching k-gram pairs (αs1, αt1) and (αs2, αt2) is equivalent to computing the weighted sum of re-writing rules for double lists matching (αs1 ⊗ αs2, αt1 ⊗ αt2). [sent-185, score-0.666]
Figure 4: For re-writing rule (A) matching both k-gram pairs shown in Fig. 2, there is a corresponding re-writing rule for double lists (B) matching the pair of double lists shown in Fig. 3. [sent-199, score-0.937]
Figure 5: Example of #Σ×Σ(·) for the two double lists shown in Fig. 3. [sent-211, score-0.366]
63 For any two k-gram pairs (αs1 , αt1 ) and (αs2 , αt2), there exists a one-to-one mapping from the set of re-writing rules matching them to the set of re-writing rules matching the corresponding double lists (αs1 ⊗ αs2 , αt1 ⊗ αt2). [sent-215, score-0.616]
Equivalently, the re-writing rule for double lists in Fig. 4 (B) matches the pair of double lists in Fig. 3. [sent-219, score-0.463]
Thus we have K̄k = ∑_{rD ∈ RD} φ̄_{rD}(αs1 ⊗ αs2, αt1 ⊗ αt2) (6), where φ̄_{rD}(αsD, αtD) = λ^{2i} if the re-writing rule for double lists rD with i wildcards matches (αsD, αtD), and φ̄_{rD}(αsD, αtD) = 0 otherwise. [sent-223, score-0.649]
To get K̄k, we just need to compute the weighted sum of re-writing rules for double lists matching (αs1 ⊗ αs2, αt1 ⊗ αt2). [sent-224, score-0.491]
Thus, we can work on the “combined” pair of double lists instead of two pairs of k-grams. [sent-225, score-0.237]
Instead of enumerating all possible re-writing rules and checking whether they can match the given pair of double lists, we only calculate the number of possibilities of “generating” re-writing rules matching the pair of double lists, which can be carried out efficiently. [sent-226, score-0.865]
We say that a re-writing rule of double lists can be generated from a pair of double lists (αsD, αtD) if they match each other. [sent-227, score-0.505]
By Eq. (6), K̄k only depends on the number of times each double occurs in the double lists. [sent-230, score-0.488]
71 We denote #e(αD) as the number of times e occurs in the list of doubles αD. [sent-232, score-0.226]
Also, for a set of doubles S ⊆ Σ × Σ, we denote #S(αD) as a vector in which each element represents #e(αD) of each double e ∈ S. [sent-233, score-0.47]
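These counts are direct to compute; a sketch with Python's Counter (helper names ours):

```python
from collections import Counter

def count_vector(S, double_list):
    """#S(alpha_D): the vector of occurrence counts #e(alpha_D)
    for each double e in the set S."""
    c = Counter(double_list)  # #e(alpha_D) for every double e
    return [c[e] for e in S]

dl = [("a", "a"), ("b", "c"), ("a", "a")]
print(count_vector([("a", "a"), ("b", "c")], dl))  # -> [2, 1]
```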
Here we use this notation to denote the number of possibilities for which i pairs of aligned wildcards can be generated from e in both αsD and αtD. [sent-241, score-0.211]
In (1), since #e(αsD) ≠ #e(αtD), it is impossible to generate a re-writing rule in which all the occurrences of non-identical double e are substituted by pairs of aligned wildcards. [sent-257, score-0.389]
75 In (2), j pairs of aligned wildcards can be generated from all the occurrences of non-identical double e in both αsD and αtD. [sent-258, score-0.484]
76 In (3), a pair of aligned wildcards can either be generated or not from a pair of identical doubles in αsD and αtD. [sent-261, score-0.493]
We can select i occurrences of identical double e from αsD, i occurrences from αtD, and generate all possible aligned wildcards from them. [sent-262, score-0.502]
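On our reading, selecting i occurrences of e on each side and aligning them bijectively contributes C(n_s, i) · C(n_t, i) · i! possibilities; the sketch below encodes exactly that reading and may differ in detail from the paper's full derivation:

```python
from math import comb, factorial

def identical_double_possibilities(n_s, n_t, i):
    """Ways to pick i occurrences of an identical double e from each
    double list and align them as i wildcard pairs (our reading):
    C(n_s, i) * C(n_t, i) * i!"""
    return comb(n_s, i) * comb(n_t, i) * factorial(i)

print(identical_double_possibilities(3, 2, 2))  # 3 * 1 * 2 -> 6
```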
The algorithm prepares two maps ms and mt and two vectors of counters cs and ct. [sent-277, score-0.161]
79 We group the k-gram pairs by their key in lines 2-5 and lines 6-9. [sent-289, score-0.272]
Since the maps in the algorithm are implemented by hash maps, the time complexities of lines 2-5 and lines 6-9 are O(kn²). [sent-311, score-0.221]
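A sketch of the grouping step with a hash map, which yields the stated O(kn²) cost when the k-gram pairs are hashed by length-k keys (key_fn stands in for the paper's key function, which is not reproduced here):

```python
from collections import defaultdict

def group_by_key(kgram_pairs, key_fn):
    """Group k-gram pairs by a key, as in lines 2-5 and 6-9 of the
    algorithm; each hash of a length-k key costs O(k), so grouping
    n^2 pairs costs O(k * n^2)."""
    groups = defaultdict(list)
    for pair in kgram_pairs:
        groups[key_fn(pair)].append(pair)
    return groups
```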
6 Experiments We evaluated the performances of the three types of string re-writing kernels on paraphrase identification and recognizing textual entailment: the pairwise k-spectrum kernel (ps-SRK), the pairwise k-wildcard kernel (pw-SRK), and the k-gram bijective string re-writing kernel (kb-SRK). [sent-331, score-2.286]
We normalized each kernel by K̃(x,y) = K(x,y) / √(K(x,x) K(y,y)) and then tried them under different window sizes k. [sent-343, score-0.414]
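This is standard cosine normalization of a kernel; a sketch (the wrapper name normalize is ours):

```python
import math

def normalize(K):
    """Return the normalized kernel K~(x, y) = K(x, y) / sqrt(K(x, x) * K(y, y))."""
    def K_norm(x, y):
        denom = math.sqrt(K(x, x) * K(y, y))
        return K(x, y) / denom if denom > 0 else 0.0
    return K_norm
```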
We also tried to combine the kernels with two lexical features, “unigram precision and recall”, proposed in (Wan et al.). [sent-344, score-0.173]
For each kernel K, we tested the window size settings of K1 + · · · + Kkmax. [sent-346, score-0.414]
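Our reading of the K1 + · · · + Kkmax notation is a sum of the kernel over window sizes 1 through kmax; a sketch under that assumption:

```python
def summed_kernel(kernel_at, k_max):
    """Combine per-window-size kernels as K_1 + ... + K_kmax.

    kernel_at: function mapping a window size k to a kernel K_k(x, y).
    """
    def K_sum(x, y):
        return sum(kernel_at(k)(x, y) for k in range(1, k_max + 1))
    return K_sum
```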
85 1 Paraphrase Identification The task of paraphrase identification is to examine whether two sentences have the same meaning. [sent-351, score-0.241]
We used the MSR paraphrase corpus (Dolan et al., 2004) consisting of 4,076 sentence pairs for training and 1,725 sentence pairs for testing. [sent-353, score-0.174]
87 kb-SRK outperforms the existing lexical approach (Zhang and Patrick, 2005) and kernel approach (Lintean and Rus, 2011). [sent-357, score-0.343]
Fig. 7 gives detailed results of the kernels under different maximum k-gram lengths kmax with and without PR. [sent-360, score-0.324]
89 54 561234 pkPsbRw_ S RaSRK bK+ P R*P window size kmax Figure 7: Performances of different kernels under different maximum window size kmax on MSRP. [sent-377, score-0.617]
Furthermore, the performances of kb-SRK with and without combining PR increase dramatically with increasing kmax and reach their peaks (better than state-of-the-art) when kmax is four, which shows the power of the lexical and structural similarity captured by kb-SRK. [sent-380, score-0.357]
2 Recognizing Textual Entailment Recognizing textual entailment is to determine whether a sentence (sometimes a short paragraph) entails the other sentence (Giampiccolo et al.). [sent-382, score-0.259]
Figure 8: Performances of different kernels under different maximum window size kmax on RTE-3. [sent-405, score-0.617]
Fig. 8 shows the results of the kernels under different parameter settings. [sent-410, score-0.173]
94 7 Conclusion In this paper, we have proposed a novel class of kernel functions for sentence re-writing, called string re-writing kernel (SRK). [sent-414, score-0.91]
Experimental results show that kb-SRK achieves better results than state-of-the-art methods on paraphrase identification and recognizing textual entailment. [sent-418, score-0.41]
96 An extensible probabilistic transformation-based approach to the third recognizing textual entailment challenge. [sent-500, score-0.283]
97 Tree edit models for recognizing textual entailments, paraphrases, and answers to questions. [sent-507, score-0.169]
98 The spectrum kernel: a string kernel for SVM protein classification. [sent-542, score-0.613]
99 Fast string kernels using inexact matching for protein sequences. [sent-549, score-0.494]
100 Proceedings of the ACL-PASCAL workshop on textual entailment and paraphrasing, pp. [sent-619, score-0.201]
wordName wordTfidf (topN-words)
[('kernel', 0.343), ('td', 0.301), ('srk', 0.245), ('double', 0.244), ('sd', 0.227), ('wildcard', 0.207), ('doubles', 0.197), ('string', 0.195), ('paraphrase', 0.188), ('kernels', 0.173), ('bijective', 0.151), ('kmax', 0.151), ('wildcards', 0.135), ('lists', 0.122), ('entailment', 0.114), ('kk', 0.108), ('rewriting', 0.102), ('leslie', 0.099), ('heilman', 0.098), ('rule', 0.097), ('lintean', 0.094), ('rus', 0.094), ('zanzotto', 0.094), ('strings', 0.094), ('lines', 0.088), ('rd', 0.087), ('textual', 0.087), ('recognizing', 0.082), ('smith', 0.077), ('protein', 0.075), ('ms', 0.073), ('ds', 0.072), ('window', 0.071), ('pairs', 0.058), ('pair', 0.057), ('kskpec', 0.057), ('noble', 0.057), ('performances', 0.055), ('paraphrasing', 0.054), ('pr', 0.053), ('identification', 0.053), ('marneffe', 0.052), ('matching', 0.051), ('lemma', 0.05), ('harmeling', 0.049), ('computation', 0.049), ('ai', 0.047), ('aligned', 0.047), ('dolan', 0.046), ('matches', 0.046), ('maps', 0.046), ('dissimilarity', 0.045), ('complexities', 0.045), ('das', 0.045), ('rules', 0.045), ('sch', 0.043), ('patrick', 0.043), ('ind', 0.042), ('counters', 0.042), ('giampiccolo', 0.042), ('brockett', 0.042), ('wan', 0.042), ('denotes', 0.041), ('theorem', 0.041), ('si', 0.04), ('lkopf', 0.04), ('substituted', 0.04), ('pairwise', 0.039), ('key', 0.038), ('tj', 0.038), ('msr', 0.038), ('smola', 0.038), ('baldrige', 0.038), ('basilico', 0.038), ('ioccurrences', 0.038), ('iwildcards', 0.038), ('kbsrk', 0.038), ('kimeldorf', 0.038), ('kkps', 0.038), ('kkpw', 0.038), ('lodhi', 0.038), ('ofdouble', 0.038), ('pwsrk', 0.038), ('wahba', 0.038), ('qiu', 0.036), ('substituting', 0.035), ('referred', 0.034), ('moschitti', 0.033), ('vt', 0.033), ('vapnik', 0.033), ('kashima', 0.033), ('maccartney', 0.032), ('product', 0.032), ('say', 0.031), ('sj', 0.03), ('ned', 0.03), ('paraphrases', 0.03), ('compute', 0.029), ('sentence', 0.029), ('denote', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000005 184 acl-2012-String Re-writing Kernel
Author: Fan Bu ; Hang Li ; Xiaoyan Zhu
Abstract: Learning for sentence re-writing is a fundamental task in natural language processing and information retrieval. In this paper, we propose a new class of kernel functions, referred to as string re-writing kernel, to address the problem. A string re-writing kernel measures the similarity between two pairs of strings, each pair representing re-writing of a string. It can capture the lexical and structural similarity between two pairs of sentences without the need of constructing syntactic trees. We further propose an instance of string rewriting kernel which can be computed efficiently. Experimental results on benchmark datasets show that our method can achieve better results than state-of-the-art methods on two sentence re-writing learning tasks: paraphrase identification and recognizing textual entailment.
2 0.19104326 116 acl-2012-Improve SMT Quality with Automatically Extracted Paraphrase Rules
Author: Wei He ; Hua Wu ; Haifeng Wang ; Ting Liu
Abstract: unkown-abstract
3 0.18118362 146 acl-2012-Modeling Topic Dependencies in Hierarchical Text Categorization
Author: Alessandro Moschitti ; Qi Ju ; Richard Johansson
Abstract: In this paper, we encode topic dependencies in hierarchical multi-label Text Categorization (TC) by means of rerankers. We represent reranking hypotheses with several innovative kernels considering both the structure of the hierarchy and the probability of nodes. Additionally, to better investigate the role ofcategory relationships, we consider two interesting cases: (i) traditional schemes in which node-fathers include all the documents of their child-categories; and (ii) more general schemes, in which children can include documents not belonging to their fathers. The extensive experimentation on Reuters Corpus Volume 1 shows that our rerankers inject effective structural semantic dependencies in multi-classifiers and significantly outperform the state-of-the-art.
4 0.17307733 183 acl-2012-State-of-the-Art Kernels for Natural Language Processing
Author: Alessandro Moschitti
Abstract: unkown-abstract
5 0.16162774 125 acl-2012-Joint Learning of a Dual SMT System for Paraphrase Generation
Author: Hong Sun ; Ming Zhou
Abstract: SMT has been used in paraphrase generation by translating a source sentence into another (pivot) language and then back into the source. The resulting sentences can be used as candidate paraphrases ofthe source sentence. Existing work that uses two independently trained SMT systems cannot directly optimize the paraphrase results. Paraphrase criteria especially the paraphrase rate is not able to be ensured in that way. In this paper, we propose a joint learning method of two SMT systems to optimize the process of paraphrase generation. In addition, a revised BLEU score (called iBLEU) which measures the adequacy and diversity of the generated paraphrase sentence is proposed for tuning parameters in SMT systems. Our experiments on NIST 2008 testing data with automatic evaluation as well as human judgments suggest that the proposed method is able to enhance the paraphrase quality by adjusting between semantic equivalency and surface dissimilarity.
7 0.11996128 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures
8 0.09056595 65 acl-2012-Crowdsourcing Inference-Rule Evaluation
9 0.089662924 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
10 0.083994433 139 acl-2012-MIX Is Not a Tree-Adjoining Language
11 0.082103014 78 acl-2012-Efficient Search for Transformation-based Inference
12 0.071395412 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
13 0.071083769 36 acl-2012-BIUTEE: A Modular Open-Source System for Recognizing Textual Entailment
14 0.059962999 12 acl-2012-A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relation Extraction
16 0.057626069 82 acl-2012-Entailment-based Text Exploration with Application to the Health-care Domain
17 0.057465855 53 acl-2012-Combining Textual Entailment and Argumentation Theory for Supporting Online Debates Interactions
18 0.054262701 181 acl-2012-Spectral Learning of Latent-Variable PCFGs
19 0.05301765 154 acl-2012-Native Language Detection with Tree Substitution Grammars
20 0.052525058 92 acl-2012-FLOW: A First-Language-Oriented Writing Assistant System
topicId topicWeight
[(0, -0.178), (1, 0.009), (2, -0.018), (3, -0.041), (4, 0.01), (5, 0.041), (6, -0.055), (7, 0.291), (8, -0.118), (9, -0.059), (10, -0.106), (11, 0.047), (12, 0.143), (13, 0.094), (14, 0.193), (15, 0.19), (16, -0.077), (17, 0.007), (18, -0.068), (19, 0.111), (20, 0.032), (21, 0.042), (22, 0.004), (23, -0.016), (24, -0.137), (25, -0.06), (26, -0.022), (27, 0.003), (28, -0.055), (29, -0.056), (30, -0.045), (31, 0.098), (32, -0.033), (33, 0.053), (34, -0.023), (35, -0.033), (36, -0.078), (37, 0.017), (38, -0.081), (39, 0.006), (40, 0.068), (41, 0.002), (42, -0.015), (43, -0.018), (44, -0.032), (45, 0.068), (46, 0.028), (47, 0.023), (48, 0.02), (49, 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 0.95668221 184 acl-2012-String Re-writing Kernel
Author: Fan Bu ; Hang Li ; Xiaoyan Zhu
Abstract: Learning for sentence re-writing is a fundamental task in natural language processing and information retrieval. In this paper, we propose a new class of kernel functions, referred to as string re-writing kernel, to address the problem. A string re-writing kernel measures the similarity between two pairs of strings, each pair representing re-writing of a string. It can capture the lexical and structural similarity between two pairs of sentences without the need of constructing syntactic trees. We further propose an instance of string rewriting kernel which can be computed efficiently. Experimental results on benchmark datasets show that our method can achieve better results than state-of-the-art methods on two sentence re-writing learning tasks: paraphrase identification and recognizing textual entailment.
2 0.66385895 146 acl-2012-Modeling Topic Dependencies in Hierarchical Text Categorization
Author: Alessandro Moschitti ; Qi Ju ; Richard Johansson
Abstract: In this paper, we encode topic dependencies in hierarchical multi-label Text Categorization (TC) by means of rerankers. We represent reranking hypotheses with several innovative kernels considering both the structure of the hierarchy and the probability of nodes. Additionally, to better investigate the role ofcategory relationships, we consider two interesting cases: (i) traditional schemes in which node-fathers include all the documents of their child-categories; and (ii) more general schemes, in which children can include documents not belonging to their fathers. The extensive experimentation on Reuters Corpus Volume 1 shows that our rerankers inject effective structural semantic dependencies in multi-classifiers and significantly outperform the state-of-the-art.
3 0.62415671 183 acl-2012-State-of-the-Art Kernels for Natural Language Processing
Author: Alessandro Moschitti
Abstract: unkown-abstract
4 0.53356481 116 acl-2012-Improve SMT Quality with Automatically Extracted Paraphrase Rules
Author: Wei He ; Hua Wu ; Haifeng Wang ; Ting Liu
Abstract: unkown-abstract
5 0.49341851 125 acl-2012-Joint Learning of a Dual SMT System for Paraphrase Generation
Author: Hong Sun ; Ming Zhou
Abstract: SMT has been used in paraphrase generation by translating a source sentence into another (pivot) language and then back into the source. The resulting sentences can be used as candidate paraphrases ofthe source sentence. Existing work that uses two independently trained SMT systems cannot directly optimize the paraphrase results. Paraphrase criteria especially the paraphrase rate is not able to be ensured in that way. In this paper, we propose a joint learning method of two SMT systems to optimize the process of paraphrase generation. In addition, a revised BLEU score (called iBLEU) which measures the adequacy and diversity of the generated paraphrase sentence is proposed for tuning parameters in SMT systems. Our experiments on NIST 2008 testing data with automatic evaluation as well as human judgments suggest that the proposed method is able to enhance the paraphrase quality by adjusting between semantic equivalency and surface dissimilarity.
7 0.46844137 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures
8 0.43589857 65 acl-2012-Crowdsourcing Inference-Rule Evaluation
9 0.38851976 139 acl-2012-MIX Is Not a Tree-Adjoining Language
10 0.36270586 92 acl-2012-FLOW: A First-Language-Oriented Writing Assistant System
11 0.35928473 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
12 0.34199041 181 acl-2012-Spectral Learning of Latent-Variable PCFGs
13 0.31808197 215 acl-2012-WizIE: A Best Practices Guided Development Environment for Information Extraction
14 0.31568438 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
15 0.29771072 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers
16 0.28371462 185 acl-2012-Strong Lexicalization of Tree Adjoining Grammars
17 0.28174165 133 acl-2012-Learning to "Read Between the Lines" using Bayesian Logic Programs
18 0.26904705 197 acl-2012-Tokenization: Returning to a Long Solved Problem A Survey, Contrastive Experiment, Recommendations, and Toolkit
19 0.2643753 78 acl-2012-Efficient Search for Transformation-based Inference
20 0.25856495 186 acl-2012-Structuring E-Commerce Inventory
topicId topicWeight
[(3, 0.279), (25, 0.014), (26, 0.041), (28, 0.038), (30, 0.044), (37, 0.054), (39, 0.069), (48, 0.026), (59, 0.015), (74, 0.036), (82, 0.017), (84, 0.022), (85, 0.031), (90, 0.083), (92, 0.07), (94, 0.015), (99, 0.063)]
simIndex simValue paperId paperTitle
Author: Eric Xing
Abstract: Probabilistic topic models have recently gained much popularity in informational retrieval and related areas. Via such models, one can project high-dimensional objects such as text documents into a low dimensional space where their latent semantics are captured and modeled; can integrate multiple sources of information—to ”share statistical strength” among components of a hierarchical probabilistic model; and can structurally display and classify the otherwise unstructured object collections. However, to many practitioners, how topic models work, what to and not to expect from a topic model, how is it different from and related to classical matrix algebraic techniques such as LSI, NMF in NLP, how to empower topic models to deal with complex scenarios such as multimodal data, contractual text in social media, evolving corpus, or presence of supervision such as labeling and rating, how to make topic modeling computationally tractable even on webscale data, etc., in a principled way, remain unclear. In this tutorial, I will demystify the conceptual, mathematical, and computational issues behind all such problems surrounding the topic models and their applications by presenting a systematic overview of the mathematical foundation of topic modeling, and its connections to a number of related methods popular in other fields such as the LDA, admixture model, mixed membership model, latent space models, and sparse coding. Iwill offer a simple and unifying view of all these techniques under the framework multi-view latent space embedding, and online the roadmap of model extension and algorithmic design to3 ward different applications in IR and NLP. A main theme of this tutorial that tie together a wide range of issues and problems will build on the ”probabilistic graphical model” formalism, a formalism that exploits the conjoined talents of graph theory and probability theory to build complex models out of simpler pieces. Iwill use this formalism as a main aid to discuss both the mathematical underpinnings for the models and the related computational issues in a unified, simplistic, transparent, and actionable fashion. Jeju, Republic of Korea,T 8ut Jourliya 2l0 A1b2s.tr ?ac c2t0s1 o2f A ACssLo 2c0ia1t2io,n p faogre C 3o,mputational Linguistics
same-paper 2 0.73911875 184 acl-2012-String Re-writing Kernel
Author: Fan Bu ; Hang Li ; Xiaoyan Zhu
Abstract: Learning for sentence re-writing is a fundamental task in natural language processing and information retrieval. In this paper, we propose a new class of kernel functions, referred to as string re-writing kernel, to address the problem. A string re-writing kernel measures the similarity between two pairs of strings, each pair representing re-writing of a string. It can capture the lexical and structural similarity between two pairs of sentences without the need of constructing syntactic trees. We further propose an instance of string rewriting kernel which can be computed efficiently. Experimental results on benchmark datasets show that our method can achieve better results than state-of-the-art methods on two sentence re-writing learning tasks: paraphrase identification and recognizing textual entailment.
3 0.48172387 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
Author: Jonathan Berant ; Ido Dagan ; Meni Adler ; Jacob Goldberger
Abstract: Learning entailment rules is fundamental in many semantic-inference applications and has been an active field of research in recent years. In this paper we address the problem of learning transitive graphs that describe entailment rules between predicates (termed entailment graphs). We first identify that entailment graphs exhibit a “tree-like” property and are very similar to a novel type of graph termed forest-reducible graph. We utilize this property to develop an iterative efficient approximation algorithm for learning the graph edges, where each iteration takes linear time. We compare our approximation algorithm to a recently-proposed state-of-the-art exact algorithm and show that it is more efficient and scalable both theoretically and empirically, while its output quality is close to that given by the optimal solution of the exact algorithm.
4 0.47791818 84 acl-2012-Estimating Compact Yet Rich Tree Insertion Grammars
Author: Elif Yamangil ; Stuart Shieber
Abstract: We present a Bayesian nonparametric model for estimating tree insertion grammars (TIG), building upon recent work in Bayesian inference of tree substitution grammars (TSG) via Dirichlet processes. Under our general variant of TIG, grammars are estimated via the Metropolis-Hastings algorithm that uses a context free grammar transformation as a proposal, which allows for cubic-time string parsing as well as tree-wide joint sampling of derivations in the spirit of Cohn and Blunsom (2010). We use the Penn treebank for our experiments and find that our proposal Bayesian TIG model not only has competitive parsing performance but also finds compact yet linguistically rich TIG representations of the data.
5 0.47648928 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers
Author: Bevan Jones ; Mark Johnson ; Sharon Goldwater
Abstract: Many semantic parsing models use tree transformations to map between natural language and meaning representation. However, while tree transformations are central to several state-of-the-art approaches, little use has been made of the rich literature on tree automata. This paper makes the connection concrete with a tree transducer based semantic parsing model and suggests that other models can be interpreted in a similar framework, increasing the generality of their contributions. In particular, this paper further introduces a variational Bayesian inference algorithm that is applicable to a wide class of tree transducers, producing state-of-the-art semantic parsing results while remaining applicable to any domain employing probabilistic tree transducers.
6 0.47322422 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures
7 0.47056186 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base
8 0.46967536 21 acl-2012-A System for Real-time Twitter Sentiment Analysis of 2012 U.S. Presidential Election Cycle
9 0.46688724 191 acl-2012-Temporally Anchored Relation Extraction
10 0.46525553 132 acl-2012-Learning the Latent Semantics of a Concept from its Definition
11 0.46518546 139 acl-2012-MIX Is Not a Tree-Adjoining Language
12 0.46511132 36 acl-2012-BIUTEE: A Modular Open-Source System for Recognizing Textual Entailment
13 0.46389943 130 acl-2012-Learning Syntactic Verb Frames using Graphical Models
14 0.46298531 28 acl-2012-Aspect Extraction through Semi-Supervised Modeling
15 0.46222931 146 acl-2012-Modeling Topic Dependencies in Hierarchical Text Categorization
16 0.46215361 63 acl-2012-Cross-lingual Parse Disambiguation based on Semantic Correspondence
17 0.46212119 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
18 0.46150166 167 acl-2012-QuickView: NLP-based Tweet Search
19 0.46136245 31 acl-2012-Authorship Attribution with Author-aware Topic Models
20 0.46089569 175 acl-2012-Semi-supervised Dependency Parsing using Lexical Affinities