emnlp emnlp2011 emnlp2011-109 knowledge-graph by maker-knowledge-mining

109 emnlp-2011-Random Walk Inference and Learning in A Large Scale Knowledge Base


Source: pdf

Author: Ni Lao ; Tom Mitchell ; William W. Cohen

Abstract: We consider the problem of performing learning and inference in a large scale knowledge base containing imperfect knowledge with incomplete coverage. We show that a soft inference procedure based on a combination of constrained, weighted, random walks through the knowledge base graph can be used to reliably infer new beliefs for the knowledge base. More specifically, we show that the system can learn to infer different target relations by tuning the weights associated with random walks that follow different paths through the graph, using a version of the Path Ranking Algorithm (Lao and Cohen, 2010b). We apply this approach to a knowledge base of approximately 500,000 beliefs extracted imperfectly from the web by NELL, a never-ending language learner (Carlson et al., 2010). This new system improves significantly over NELL’s earlier Horn-clause learning and inference method: it obtains nearly double the precision at rank 100, and the new learning method is also applicable to many more inference tasks.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 We consider the problem of performing learning and inference in a large scale knowledge base containing imperfect knowledge with incomplete coverage. [sent-5, score-0.323]

2 We show that a soft inference procedure based on a combination of constrained, weighted, random walks through the knowledge base graph can be used to reliably infer new beliefs for the knowledge base. [sent-6, score-0.672]

3 More specifically, we show that the system can learn to infer different target relations by tuning the weights associated with random walks that follow different paths through the graph, using a version of the Path Ranking Algorithm (Lao and Cohen, 2010b). [sent-7, score-0.372]

4 We apply this approach to a knowledge base of approximately 500,000 beliefs extracted imperfectly from the web by NELL, a never-ending language learner (Carlson et al. [sent-8, score-0.327]

5 This new system improves significantly over NELL’s earlier Horn-clause learning and inference method: it obtains nearly double the precision at rank 100, and the new learning method is also applicable to many more inference tasks. [sent-10, score-0.202]

6 Logical inference methods are too brittle to be used to make complex inferences from automatically-extracted knowledge, and probabilistic inference methods (Richardson and Domingos, 2006) suffer from scalability problems. [sent-21, score-0.251]

7 This paper considers the problem of constructing inference methods that can scale to large knowledge bases (KB’s), and that are robust to imperfect knowledge. [sent-22, score-0.168]

8 As of March 2011, NELL had built a knowledge base containing several million candidate beliefs which it had extracted from the web with varying confidence. [sent-30, score-0.278]

9 Each day, NELL has two tasks: (1) to extract additional beliefs from the web to populate its growing knowledge base (KB) with instances of the categories and relations in its ontology, and (2) to learn to perform task 1 better today than it could yesterday. [sent-43, score-0.314]

10 At present its only inference method beyond simple inheritance involves applying first order Horn clause rules to infer new beliefs from current beliefs. [sent-59, score-0.395]

11 NELL currently has a set of approximately 600 such rules, which it has learned by data mining its knowledge base of beliefs. [sent-61, score-0.155]

12 After a clause is learned, the examples covered by that clause are removed from the training set, and the process repeats until no positive examples remain. [sent-68, score-0.174]

13 The starting point for a predicate R is found by looking at positive instances R(a, b) of the consequent, and finding a clause that corresponds to a bounded-length path of binary relations that link a to b. [sent-81, score-0.347]

14 In the example above, a start clause might be the clause (1). [sent-82, score-0.174]

15 One important role of these inference rules is that they contribute to the bootstrapping procedure, as inferences made by N-FOIL either increase the number of candidate beliefs, or (if the inference is already a candidate) improve NELL’s confidence in those beliefs. [sent-87, score-0.292]

16 PRA learns to rank graph nodes b relative to a query node a. [sent-90, score-0.277]

17 PRA begins by enumerating a large set of bounded-length edge-labeled path types, similar to the initial clauses used in NELL’s variant of FOIL. [sent-91, score-0.182]

18 These path types are treated as ranking “experts”, 531 each performing a random walk through the graph, constrained to follow that sequence of edge types, and ranking nodes b by their weights in the resulting distribution. [sent-92, score-0.483]

19 If HinesWard is linked to the single concept node ProfessionalAthlete via isa, the walk will reach that node with probability 1 after one step. [sent-95, score-0.329]

20 If L is the set of athletic leagues and ℓ ∈ L, let Aℓ be the set of athletes in league ℓ: after three steps, the walk will have probability |Aℓ|/|A| of being at any point b ∈ L. [sent-97, score-0.32]

21 In short, the ranking associated with this path gives the prior probability of a value b being an athletic league for a—which is useful as a feature in a combined ranking method, although not by itself a high-precision inference rule. [sent-98, score-0.479]
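
The three-step walk behind this league prior is easy to make concrete. Below is a minimal sketch (not the paper's code) of a path-constrained random walk over a toy knowledge base, following isa, then the inverse of isa, then athletePlaysInLeague; the entity and relation names are illustrative, and the resulting distribution is the |Aℓ|/|A| prior described above.

```python
from collections import defaultdict

# Toy KB: (node, relation) -> list of neighbor nodes. Names are illustrative.
edges = {
    ("HinesWard", "isa"): ["ProfessionalAthlete"],
    ("ProfessionalAthlete", "isa_inv"): ["HinesWard", "EliManning", "Ovechkin"],
    ("HinesWard", "athletePlaysInLeague"): ["NFL"],
    ("EliManning", "athletePlaysInLeague"): ["NFL"],
    ("Ovechkin", "athletePlaysInLeague"): ["NHL"],
}

def walk(dist, relation, edges):
    """One random-walk step constrained to follow `relation`."""
    out = defaultdict(float)
    for node, p in dist.items():
        targets = edges.get((node, relation), [])
        for t in targets:
            out[t] += p / len(targets)  # uniform over outgoing edges
    return dict(out)

dist = {"HinesWard": 1.0}  # all probability mass on the query node a
for rel in ("isa", "isa_inv", "athletePlaysInLeague"):
    dist = walk(dist, rel, edges)
print(dist)  # {'NFL': 0.666..., 'NHL': 0.333...}: |A_l|/|A| for each league
```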

22 Note that the rankings produced by this “expert” will change as the knowledge base evolves—for instance, if the system learns about proportionally more soccer players than hockey players over time, then the league rankings for the path of clause (3) will change. [sent-99, score-0.688]

23 Also, the ranking is specific to the query node a. [sent-100, score-0.245]

24 Then the path for clause (1) above will give lower weight to b = NFL for a = EliManning than to b = NFL for a = HinesWard. [sent-102, score-0.224]

25 Compared to Horn clause inference, the key characteristics of this new inference method are as follows. (Footnote 2: San Francisco’s Major-League Baseball and New York’s National Football League teams are both called the “Giants”.) [sent-104, score-0.276]

26 • The evidence in support of inferring a relation instance R(a, b) is based on many existing paths between a and b in the current KB, combined using a learned logistic function. [sent-105, score-0.206]

27 • The confidence in an inference is sensitive to the current state of the knowledge base, and to the specific entities being queried (since the paths used in the inference have these properties). [sent-106, score-0.394]

28 • Experimentally, the inference method yields many more moderately-confident inferences than the Horn clauses learned by N-FOIL. [sent-107, score-0.146]

29 • The learning and inference are more efficient than N-FOIL, in part because we can exploit efficient approximation schemes for random walks (Lao and Cohen, 2010a). [sent-108, score-0.241]

30 The resulting inference is as fast as 10 milliseconds per query on average. [sent-109, score-0.227]

31 The Path Ranking Algorithm (PRA) we use is similar to that described elsewhere (Lao and Cohen, 2010b), except that to achieve efficient model learning, the paths between a and b are determined by the statistics from a population of training queries rather than enumerated completely. [sent-110, score-0.244]

32 PRA uses random walks to generate relational features on graph data, and combines them with a logistic regression model. [sent-111, score-0.224]

33 Compared with other relational learning approaches (e.g., FOIL, Markov Logic Networks), PRA is extremely efficient at link prediction or retrieval tasks, in which we are interested in identifying top links from a large number of candidates, instead of focusing on a particular node pair or joint inferences. [sent-114, score-0.143]

34 The system of Cafarella et al. (2006) answers list queries on a large knowledge base produced by open domain information extraction. [sent-117, score-0.246]

35 Spreading activation is used to measure the closeness of any node to the query term nodes. [sent-118, score-0.205]

36 This approach is similar to the random walk with restart approach which is used as a baseline in our experiment. [sent-119, score-0.177]

37 A Markov network corresponding to the grounding of these rules to the knowledge base is constructed for each query, and then belief propagation is used for inference. [sent-124, score-0.196]

38 In comparison, our proposed approach is much more efficient by avoiding the harder problem of joint inferences and by leveraging efficient random walk schemes (Lao and Cohen, 2010a). [sent-127, score-0.226]

39 (Section 2: Approach) In this section, we first describe how we formulate link (relation) prediction on a knowledge base as a ranking task. [sent-129, score-0.259]

40 (Section 2.1: Learning with NELL’s Knowledge Base) For each relation R in the knowledge base we train a model for the link prediction task: given a concept a, find all other concepts b which potentially have the relation R(a, b). [sent-135, score-0.366]

41 This prediction is made based on an existing knowledge base extracted imperfectly from the web. [sent-136, score-0.237]

42 To ensure a reasonable number of training instances, we generate labeled training example queries from 48 relations which have more than 100 instances in the knowledge base. [sent-138, score-0.166]

43 Each node a which has relation R in the knowledge base with any other node is treated as a training query, the actual nodes b in the knowledge base known to satisfy R(a, b) are treated as labeled positive examples, and any other nodes are treated as negative examples. [sent-142, score-0.587]
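
As a concrete illustration of this training-set construction, here is a minimal sketch, under the assumption that the KB is available as a map from each relation R to its set of known (a, b) pairs; the names are hypothetical, not NELL's actual API.

```python
def make_training_data(kb, R, concepts):
    """Labeled queries for relation R, as described above.
    `kb[R]` is a set of known (a, b) pairs; `concepts` is the set of
    all concept nodes. Names are illustrative."""
    data = []
    for a in {a for (a, b) in kb[R]}:              # each a that has R
        pos = {b for (a2, b) in kb[R] if a2 == a}  # known R(a, b): positives
        data.append((a, pos, concepts - pos))      # all other nodes: negatives
    return data
```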

44 A relation path P is defined as a sequence of relations R1 . . . Rℓ. [sent-145, score-0.226]

45 Given a relation path P = R1 . . . Rℓ and a seed node s ∈ domain(P), a path-constrained random walk defines a distribution hs,P recursively as follows. [sent-156, score-0.216]

46 If P is not the empty path, let P′ = R1 . . . Rℓ−1, and define hs,P(e) = Σe′∈range(P′) hs,P′(e′) · P(e|e′; Rℓ), (5) where P(e|e′; Rℓ) = |Rℓ(e′, e)| / |Rℓ(e′, ·)| is the probability of reaching node e from node e′ with a one-step random walk with edge type Rℓ. [sent-164, score-0.391]

47 Given paths P1, . . . , Pn, one could treat each hs,Pi(e) as a path feature for the node e, and rank nodes by a linear model θ1 hs,P1(e) + θ2 hs,P2(e) + . . . [sent-169, score-0.249]

48 This gives a ranking of nodes e related to the query node s by the scoring function score(e; s) = ΣP∈Pℓ hs,P(e) θP, (6) where Pℓ is the set of relation paths with length ≤ ℓ. [sent-173, score-0.484]
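
Sentences 44-48 together define the PRA model: a recursive path-constrained walk distribution (equation 5) combined linearly, one feature per path (equation 6). The sketch below is illustrative, not the authors' implementation; it assumes a KB as a dict mapping (node, relation) pairs to neighbor lists, as in the toy example earlier, and binary edges, so P(e|e′; Rℓ) is uniform over the targets of Rℓ. In the full system the per-path weights θ are fit with logistic regression.

```python
from collections import defaultdict

def h(s, path, edges):
    """h_{s,P}: path-constrained walk distribution (equation 5).
    `path` is a tuple of relation names R1 ... Rl; `edges` maps
    (node, relation) -> list of neighbors."""
    if not path:                   # empty path: all mass on the seed s
        return {s: 1.0}
    prev = h(s, path[:-1], edges)  # h_{s,P'} for the prefix P' = R1 ... R_{l-1}
    out = defaultdict(float)
    for e0, p in prev.items():
        targets = edges.get((e0, path[-1]), [])
        for e in targets:          # binary edges: P(e|e0; Rl) = 1/|Rl(e0, .)|
            out[e] += p / len(targets)
    return dict(out)

def score(s, paths, theta, edges):
    """score(e; s) = sum_P theta_P * h_{s,P}(e) (equation 6)."""
    total = defaultdict(float)
    for path, w in zip(paths, theta):
        for e, p in h(s, path, edges).items():
            total[e] += w * p
    return dict(total)
```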

49 (Section 2.3: Data-Driven Path Finding) In prior work with PRA, Pℓ was defined as all relation paths of length at most ℓ. [sent-183, score-0.206]

50 When the number of edge types is small, one can generate Pℓ by systematic enumeration. (Table 1: Number of paths in PRA models of maximum path length 3 and 4.) [sent-184, score-0.346]

51 Table 1 rows: for ℓ = 3 and ℓ = 4 respectively, “all paths up to length ℓ” gives 15,376 and 1,906,624 paths; the “+ query support ≥ α = 0.01” row shows the much smaller counts that remain. [sent-186, score-0.153]

52 However, when the graph is large and diverse (e.g., a knowledge base), it is impractical to enumerate all possible relation paths even for small ℓ. [sent-189, score-0.245]

53 For instance, if the number of edge types related to each node type is 100, even the number of length-three path types easily reaches millions. [sent-190, score-0.288]

54 For other domains like parsed natural language sentences, useful relation paths can be as long as ten relations (Minkov and Cohen, 2008). [sent-191, score-0.242]

55 In this case, even with a smaller number of possible edge types, the total number of relation paths is still too large for systematic enumeration. [sent-192, score-0.262]

56 In order to apply PRA to these domains, we modify the path generation procedure in PRA to produce only relation paths which are potentially useful for the task. [sent-193, score-0.343]

57 Define a query s to be supporting a path P if hs,P(e) ≠ 0 for any entity e. [sent-194, score-0.263]

58 We require that any path created during path finding needs to be supported by at least a fraction α of the training queries si, as well as being of length no more than ℓ (in the experiments, we set α = 0.01). [sent-195, score-0.365]

59 We also require that in order for a relation path to be included in the PRA model, it must retrieve at least one target entity ti in the training set. [sent-196, score-0.19]
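
These two constraints translate directly into a filter over candidate paths. A minimal sketch, assuming the h(s, path, edges) walk distribution from the earlier sketch; the function name is illustrative, and α = 0.01 as in the experiments.

```python
def keep_path(path, train_queries, train_targets, edges, alpha=0.01):
    """Apply the two data-driven path-finding constraints described above."""
    # Constraint 1: the path must be supported (h_{s,P} nonzero somewhere)
    # by at least a fraction alpha of the training queries s_i.
    supporters = [s for s in train_queries if h(s, path, edges)]
    if len(supporters) < alpha * len(train_queries):
        return False
    # Constraint 2: the path must retrieve at least one known target t_i.
    return any(t in h(s, path, edges)
               for s, t in zip(train_queries, train_targets))
```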

60 As we can see from Table 1, together these two constraints dramatically reduce the number of relation paths that need to be considered, relative to systematically enumerating all possible relation paths. [sent-197, score-0.259]

61 The idea of finding paths that connect nodes in a graph is not new. [sent-199, score-0.225]

62 These approaches consider a single query during path finding. [sent-202, score-0.263]

63 In comparison, the data-driven path finding method we described here uses statistics from a population of queries, and therefore can potentially determine the importance of a path more reliably. [sent-203, score-0.274]

64 (Section 2.4: Low-Variance Sampling) Lao and Cohen (2010a) previously showed that sampling techniques like finger printing and particle filtering can significantly speed up random walks without sacrificing retrieval quality. [sent-217, score-0.421]

65 For example, consider a node in the graph with just two out links with equal weights, and suppose we are required to generate two walkers starting from this node. [sent-219, score-0.175]
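
Low-variance sampling in this setting means spreading a node's walkers systematically over its out-links instead of sampling each walker independently. The sketch below shows one such allocation scheme (systematic resampling with a single random offset); this is our reading of the idea, not the paper's exact procedure. With two equal-weight links and two walkers it always sends exactly one walker down each link, matching the example above.

```python
import random

def low_variance_allocate(weights, n):
    """Allocate n walkers over out-links with the given weights using a
    single random offset and evenly spaced sampling points."""
    total = sum(weights)
    step = total / n
    u = random.uniform(0, step)        # one random draw for all n walkers
    counts, cum, i = [0] * len(weights), weights[0], 0
    for _ in range(n):
        while u > cum:                 # advance to the link covering u
            i += 1
            cum += weights[i]
        counts[i] += 1
        u += step                      # evenly spaced points over [0, total)
    return counts

print(low_variance_allocate([0.5, 0.5], 2))  # always [1, 1]
```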

66 (Section 3: Results) This section reports empirical results of applying random walk inference to NELL’s knowledge base after the 165th iteration of its learning process. [sent-230, score-0.433]

67 Figure 2 compares independent and low variance sampling when applied to finger printing and particle filtering (Lao and Cohen, 2010a). [sent-249, score-0.203]

68 The horizontal axis corresponds to the speedup of the random walk compared with exact inference, and the vertical axis measures the quality of prediction by MRR with three-fold cross-validation on the training query set. [sent-250, score-0.377]

69 Generally, models with length-4 paths produce slightly better results, but are 4-5 times slower to train. (Footnote 4: for a set of queries Q, MRR = (1/|Q|) Σq∈Q 1/(rank of the first correct answer for q).) (Figure 2: Compare inference speed and quality over 96 tasks; the horizontal axis is random walk speedup.) [sent-252, score-0.345]
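
The MRR measure in the footnote is straightforward to compute. A minimal sketch, with illustrative data structures (a ranked candidate list and a gold set per query):

```python
def mrr(queries, ranked, gold):
    """Mean reciprocal rank: (1/|Q|) * sum over q of 1/rank of first hit.
    `ranked[q]` is a ranked candidate list; `gold[q]` is a set of answers."""
    total = 0.0
    for q in queries:
        for rank, cand in enumerate(ranked[q], start=1):
            if cand in gold[q]:
                total += 1.0 / rank  # reciprocal rank of first correct answer
                break
    return total / len(queries)
```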

70 Low-variance sampling can improve prediction for both finger printing and particle filtering. [sent-254, score-0.236]

71 Interestingly, when using a large number of walkers, the finger printing methods produce even better prediction quality than exact inference. [sent-257, score-0.147]

72 Lao and Cohen noticed a similar improvement on retrieval tasks, and conjectured that it is because the sampling inference imposes a regularization penalty on longer relation paths (2010a). [sent-258, score-0.347]

73 (Section 3.2: Evaluation by Mechanical Turk) The cross-validation result above assumes that the knowledge base is complete and correct, which we know to be untrue. [sent-260, score-0.155]

74 To accurately compare PRA and N-FOIL’s ability to reliably infer new beliefs from an imperfect knowledge base, we use human assessments obtained from Amazon Mechanical Turk. [sent-261, score-0.265]

75 Among all the 72 non-functional predicates—which … (Table 3: The top two weighted PRA paths for tasks on which N-FOIL discovers confident rules.) [sent-265, score-0.187]

76 (Table 3 path glosses: “sport teams in the league”; “played by other teams in the league”.) Using a paired t-test at the task level, PRA is not statistically different from N-FOIL for p@10 (p-value=0. [sent-268, score-0.254]

77 9, 11, 14), rules which leverage information about the synonyms of the query node (e.g. [sent-279, score-0.246]

78 7-8, 10, 12), and rules which leverage information from a local neighborhood of the query node (e.g. [sent-281, score-0.246]

79 The synonym paths are useful, because an entity may have multiple names on the web. [sent-284, score-0.153]

80 We find that all 17 general rules (no specialization) learned by N-FOIL can be expressed as length-two relation paths such as path 11. [sent-285, score-0.384]

81 For each relation R to be evaluated, we generate test queries s which belong to domain(R). [sent-287, score-0.144]

82 For each query node s, we applied a trained model (either PRA or N-FOIL) to generate a ranked list of candidate t nodes. [sent-289, score-0.205]

83 Since there are about 7 thousand (and 13 thousand) test queries s for each functional (and non-functional) predicate R, and there are (potentially) thousands of candidates t returned for each query s, we cannot evaluate all candidates of all queries. [sent-294, score-0.309]

84 Therefore, we first sort the queries s for each predicate R by the scores of their top ranked candidate t in descending order, and then calculate precisions at the top 10, 100 and 1000 positions for the list of results R(sR,1, t1R,1), R(sR,2, t1R,2), . . . [sent-295, score-0.178]

85 Here sR,1 is the first query for predicate R, t1R,1 is its first candidate, sR,2 is the second query for predicate R, and so forth.
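
This evaluation ordering can be made concrete as follows. The sketch uses hypothetical data structures: `results[s]` is a score-sorted candidate list per query, and `is_correct(s, t)` stands in for the Mechanical Turk judgments described next.

```python
def precision_at(results, is_correct, ks=(10, 100, 1000)):
    """Sort queries by the score of their top-ranked candidate, then
    measure precision at the top k positions of the belief list R(s, t)."""
    ordered = sorted(results, key=lambda s: results[s][0][1], reverse=True)
    beliefs = [(s, results[s][0][0]) for s in ordered]  # top t per query
    return {k: sum(is_correct(s, t) for s, t in beliefs[:k]) / k
            for k in ks}
```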

86 Each belief is evaluated by 5 workers at Mechanical Turk, who are given assertions like “Hines Ward plays for the team Steelers”, as well as Google search links for each entity, and the combination of both entities. [sent-301, score-0.152]

87 The Pmajority column shows for each predicate the accuracy achieved by the majority prediction: given a query R(a, ?). [sent-307, score-0.182]

88 In comparison, PRA is able to produce results for 6,599 queries on average for each functional predicate, and 12,519 queries on average for each non-functional predicate. [sent-313, score-0.218]

89 However, learning more accurate specialized paths is part of our future work. [sent-318, score-0.186]

90 Compared to the result on functional predicates, precisions at 10 and at 100 of non-functional predicates are slightly lower, but precisions at 1000 are comparable. [sent-321, score-0.164]

91 (Section 4: Conclusions and Future Work) We have shown that a soft inference procedure based on a combination of constrained, weighted, random walks through the knowledge base graph can be used to reliably infer new beliefs for the knowledge base. [sent-332, score-0.672]

92 We applied this approach to a knowledge base of approximately 500,000 beliefs extracted imperfectly from the web by NELL. [sent-333, score-0.327]

93 The inference and learning are both very efficient—our experiment shows that the inference time is as fast as 10 milliseconds per query on average, and the training for a predicate takes only a few seconds. [sent-335, score-0.384]

94 First, inference starting from both the query nodes and target nodes (Richards and Mooney, 1992) can be much more efficient in discovering long paths than just inference from the query nodes. [sent-337, score-0.673]

95 Second, inference starting from the target nodes of training queries is a potential way to discover specialized paths (with grounded nodes). [sent-338, score-0.411]

96 Third, generalizing inference paths to inference trees or graphs can produce more expressive random walk inference models. [sent-339, score-0.633]

97 Overall, we believe that random walk is a promising way to scale up relational learning to domains with very large data sets. [sent-340, score-0.222]

98 Gordon for the suggestion of applying low variance sampling to random walk inference. [sent-343, score-0.217]

99 Fast query execution for retrieval models based on path-constrained random walks. [sent-408, score-0.173]

100 Learning graph walk based similarity measures for parsed text. [sent-418, score-0.169]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('pra', 0.559), ('nell', 0.346), ('lao', 0.215), ('league', 0.161), ('paths', 0.153), ('path', 0.137), ('walk', 0.13), ('athleteplaysinleague', 0.129), ('rwr', 0.129), ('query', 0.126), ('horn', 0.124), ('isa', 0.124), ('beliefs', 0.123), ('base', 0.116), ('inference', 0.101), ('walks', 0.093), ('queries', 0.091), ('ay', 0.09), ('cohen', 0.088), ('teams', 0.088), ('clause', 0.087), ('team', 0.084), ('kb', 0.083), ('node', 0.079), ('gu', 0.078), ('players', 0.074), ('um', 0.074), ('mrr', 0.073), ('foil', 0.072), ('workers', 0.068), ('predicates', 0.066), ('stadium', 0.062), ('finger', 0.057), ('lvs', 0.057), ('nfl', 0.057), ('printing', 0.057), ('walkers', 0.057), ('le', 0.057), ('predicate', 0.056), ('edge', 0.056), ('relation', 0.053), ('mechanical', 0.053), ('athlete', 0.049), ('richards', 0.049), ('inferences', 0.049), ('imperfectly', 0.049), ('sport', 0.049), ('particle', 0.049), ('random', 0.047), ('te', 0.046), ('turk', 0.045), ('ue', 0.045), ('clauses', 0.045), ('relational', 0.045), ('forbes', 0.043), ('hinesward', 0.043), ('professionalathlete', 0.043), ('robotics', 0.043), ('unspecialized', 0.043), ('infer', 0.043), ('om', 0.042), ('speedup', 0.041), ('ea', 0.041), ('concept', 0.041), ('rules', 0.041), ('ranking', 0.04), ('cafarella', 0.04), ('sampling', 0.04), ('knowledge', 0.039), ('graph', 0.039), ('ss', 0.038), ('logic', 0.038), ('carlson', 0.037), ('preconditions', 0.037), ('ty', 0.037), ('ad', 0.036), ('functional', 0.036), ('relations', 0.036), ('consequent', 0.034), ('confident', 0.034), ('nodes', 0.033), ('prediction', 0.033), ('specialized', 0.033), ('reliably', 0.032), ('banko', 0.032), ('precisions', 0.031), ('link', 0.031), ('sports', 0.029), ('played', 0.029), ('athleteplaysforteam', 0.029), ('bhalotia', 0.029), ('factrank', 0.029), ('ium', 0.029), ('leagues', 0.029), ('pmajority', 0.029), ('teamplaysinleague', 0.029), ('thrun', 0.029), ('untrained', 0.029), ('imperfect', 0.028), ('si', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999982 109 emnlp-2011-Random Walk Inference and Learning in A Large Scale Knowledge Base

Author: Ni Lao ; Tom Mitchell ; William W. Cohen

Abstract: We consider the problem of performing learning and inference in a large scale knowledge base containing imperfect knowledge with incomplete coverage. We show that a soft inference procedure based on a combination of constrained, weighted, random walks through the knowledge base graph can be used to reliably infer new beliefs for the knowledge base. More specifically, we show that the system can learn to infer different target relations by tuning the weights associated with random walks that follow different paths through the graph, using a version of the Path Ranking Algorithm (Lao and Cohen, 2010b). We apply this approach to a knowledge base of approximately 500,000 beliefs extracted imperfectly from the web by NELL, a never-ending language learner (Carlson et al., 2010). This new system improves significantly over NELL’s earlier Horn-clause learning and inference method: it obtains nearly double the precision at rank 100, and the new learning method is also applicable to many more inference tasks.

2 0.26068518 40 emnlp-2011-Discovering Relations between Noun Categories

Author: Thahir Mohamed ; Estevam Hruschka ; Tom Mitchell

Abstract: Traditional approaches to Relation Extraction from text require manually defining the relations to be extracted. We propose here an approach to automatically discovering relevant relations, given a large text corpus plus an initial ontology defining hundreds of noun categories (e.g., Athlete, Musician, Instrument). Our approach discovers frequently stated relations between pairs of these categories, using a two step process. For each pair of categories (e.g., Musician and Instrument) it first coclusters the text contexts that connect known instances of the two categories, generating a candidate relation for each resulting cluster. It then applies a trained classifier to determine which of these candidate relations is semantically valid. Our experiments apply this to a text corpus containing approximately 200 million web pages and an ontology containing 122 categories from the NELL system [Carlson et al., 2010b], producing a set of 781 proposed can- didate relations, approximately half of which are semantically valid. We conclude this is a useful approach to semi-automatic extension of the ontology for large-scale information extraction systems such as NELL. 1

3 0.13343954 90 emnlp-2011-Linking Entities to a Knowledge Base with Query Expansion

Author: Swapna Gottipati ; Jing Jiang

Abstract: In this paper we present a novel approach to entity linking based on a statistical language model-based information retrieval with query expansion. We use both local contexts and global world knowledge to expand query language models. We place a strong emphasis on named entities in the local contexts and explore a positional language model to weigh them differently based on their distances to the query. Our experiments on the TAC-KBP 2010 data show that incorporating such contextual information indeed aids in disambiguating the named entities and consistently improves the entity linking performance. Compared with the official results from KBP 2010 participants, our system shows competitive performance.

4 0.10212965 128 emnlp-2011-Structured Relation Discovery using Generative Models

Author: Limin Yao ; Aria Haghighi ; Sebastian Riedel ; Andrew McCallum

Abstract: We explore unsupervised approaches to relation extraction between two named entities; for instance, the semantic bornIn relation between a person and location entity. Concretely, we propose a series of generative probabilistic models, broadly similar to topic models, each which generates a corpus of observed triples of entity mention pairs and the surface syntactic dependency path between them. The output of each model is a clustering of observed relation tuples and their associated textual expressions to underlying semantic relation types. Our proposed models exploit entity type constraints within a relation as well as features on the dependency path between entity mentions. We examine effectiveness of our approach via multiple evaluations and demonstrate 12% error reduction in precision over a state-of-the-art weakly supervised baseline.

5 0.081793807 29 emnlp-2011-Collaborative Ranking: A Case Study on Entity Linking

Author: Zheng Chen ; Heng Ji

Abstract: In this paper, we present a new ranking scheme, collaborative ranking (CR). In contrast to traditional non-collaborative ranking scheme which solely relies on the strengths of isolated queries and one stand-alone ranking algorithm, the new scheme integrates the strengths from multiple collaborators of a query and the strengths from multiple ranking algorithms. We elaborate three specific forms of collaborative ranking, namely, micro collaborative ranking (MiCR), macro collaborative ranking (MaCR) and micro-macro collaborative ranking (MiMaCR). Experiments on entity linking task show that our proposed scheme is indeed effective and promising.

6 0.061939619 57 emnlp-2011-Extreme Extraction - Machine Reading in a Week

7 0.060874626 113 emnlp-2011-Relation Acquisition using Word Classes and Partial Patterns

8 0.059158709 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances

9 0.057232115 5 emnlp-2011-A Fast Re-scoring Strategy to Capture Long-Distance Dependencies

10 0.053039189 114 emnlp-2011-Relation Extraction with Relation Topics

11 0.052659094 26 emnlp-2011-Class Label Enhancement via Related Instances

12 0.05068329 81 emnlp-2011-Learning General Connotation of Words using Graph-based Algorithms

13 0.049006071 116 emnlp-2011-Robust Disambiguation of Named Entities in Text

14 0.048762619 17 emnlp-2011-Active Learning with Amazon Mechanical Turk

15 0.04842525 50 emnlp-2011-Evaluating Dependency Parsing: Robust and Heuristics-Free Cross-Annotation Evaluation

16 0.047903433 7 emnlp-2011-A Joint Model for Extended Semantic Role Labeling

17 0.047556471 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts

18 0.047422107 145 emnlp-2011-Unsupervised Semantic Role Induction with Graph Partitioning

19 0.047097303 100 emnlp-2011-Optimal Search for Minimum Error Rate Training

20 0.046946704 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.17), (1, -0.089), (2, -0.088), (3, 0.005), (4, -0.026), (5, -0.177), (6, 0.049), (7, -0.089), (8, -0.013), (9, 0.169), (10, 0.074), (11, 0.014), (12, -0.174), (13, 0.003), (14, 0.113), (15, -0.061), (16, 0.174), (17, -0.047), (18, 0.022), (19, -0.025), (20, 0.203), (21, -0.078), (22, 0.098), (23, 0.276), (24, -0.015), (25, 0.06), (26, -0.093), (27, -0.272), (28, -0.029), (29, 0.065), (30, 0.249), (31, 0.264), (32, -0.018), (33, -0.156), (34, -0.063), (35, 0.059), (36, 0.076), (37, -0.02), (38, -0.097), (39, 0.061), (40, 0.097), (41, -0.037), (42, 0.029), (43, 0.037), (44, -0.015), (45, 0.039), (46, 0.019), (47, -0.001), (48, 0.013), (49, -0.033)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93648422 109 emnlp-2011-Random Walk Inference and Learning in A Large Scale Knowledge Base

Author: Ni Lao ; Tom Mitchell ; William W. Cohen

Abstract: We consider the problem of performing learning and inference in a large scale knowledge base containing imperfect knowledge with incomplete coverage. We show that a soft inference procedure based on a combination of constrained, weighted, random walks through the knowledge base graph can be used to reliably infer new beliefs for the knowledge base. More specifically, we show that the system can learn to infer different target relations by tuning the weights associated with random walks that follow different paths through the graph, using a version of the Path Ranking Algorithm (Lao and Cohen, 2010b). We apply this approach to a knowledge base of approximately 500,000 beliefs extracted imperfectly from the web by NELL, a never-ending language learner (Carlson et al., 2010). This new system improves significantly over NELL’s earlier Horn-clause learning and inference method: it obtains nearly double the precision at rank 100, and the new learning method is also applicable to many more inference tasks.

2 0.85886234 40 emnlp-2011-Discovering Relations between Noun Categories

Author: Thahir Mohamed ; Estevam Hruschka ; Tom Mitchell

Abstract: Traditional approaches to Relation Extraction from text require manually defining the relations to be extracted. We propose here an approach to automatically discovering relevant relations, given a large text corpus plus an initial ontology defining hundreds of noun categories (e.g., Athlete, Musician, Instrument). Our approach discovers frequently stated relations between pairs of these categories, using a two step process. For each pair of categories (e.g., Musician and Instrument) it first coclusters the text contexts that connect known instances of the two categories, generating a candidate relation for each resulting cluster. It then applies a trained classifier to determine which of these candidate relations is semantically valid. Our experiments apply this to a text corpus containing approximately 200 million web pages and an ontology containing 122 categories from the NELL system [Carlson et al., 2010b], producing a set of 781 proposed can- didate relations, approximately half of which are semantically valid. We conclude this is a useful approach to semi-automatic extension of the ontology for large-scale information extraction systems such as NELL. 1

3 0.34514603 90 emnlp-2011-Linking Entities to a Knowledge Base with Query Expansion

Author: Swapna Gottipati ; Jing Jiang

Abstract: In this paper we present a novel approach to entity linking based on a statistical language model-based information retrieval with query expansion. We use both local contexts and global world knowledge to expand query language models. We place a strong emphasis on named entities in the local contexts and explore a positional language model to weigh them differently based on their distances to the query. Our experiments on the TAC-KBP 2010 data show that incorporating such contextual information indeed aids in disambiguating the named entities and consistently improves the entity linking performance. Compared with the official results from KBP 2010 participants, our system shows competitive performance.

4 0.28580853 29 emnlp-2011-Collaborative Ranking: A Case Study on Entity Linking

Author: Zheng Chen ; Heng Ji

Abstract: In this paper, we present a new ranking scheme, collaborative ranking (CR). In contrast to traditional non-collaborative ranking scheme which solely relies on the strengths of isolated queries and one stand-alone ranking algorithm, the new scheme integrates the strengths from multiple collaborators of a query and the strengths from multiple ranking algorithms. We elaborate three specific forms of collaborative ranking, namely, micro collaborative ranking (MiCR), macro collaborative ranking (MaCR) and micro-macro collab- orative ranking (MiMaCR). Experiments on entity linking task show that our proposed scheme is indeed effective and promising.

5 0.2120297 26 emnlp-2011-Class Label Enhancement via Related Instances

Author: Zornitsa Kozareva ; Konstantin Voevodski ; Shanghua Teng

Abstract: Class-instance label propagation algorithms have been successfully used to fuse information from multiple sources in order to enrich a set of unlabeled instances with class labels. Yet, nobody has explored the relationships between the instances themselves to enhance an initial set of class-instance pairs. We propose two graph-theoretic methods (centrality and regularization), which start with a small set of labeled class-instance pairs and use the instance-instance network to extend the class labels to all instances in the network. We carry out a comparative study with state-of-the-art knowledge harvesting algorithm and show that our approach can learn additional class labels while maintaining high accuracy. We conduct a comparative study between class-instance and instance-instance graphs used to propagate the class labels and show that the latter one achieves higher accuracy.

6 0.20825656 47 emnlp-2011-Efficient retrieval of tree translation examples for Syntax-Based Machine Translation

7 0.20636378 57 emnlp-2011-Extreme Extraction - Machine Reading in a Week

8 0.20053969 114 emnlp-2011-Relation Extraction with Relation Topics

9 0.1942219 128 emnlp-2011-Structured Relation Discovery using Generative Models

10 0.19341691 81 emnlp-2011-Learning General Connotation of Words using Graph-based Algorithms

11 0.1886503 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts

12 0.1857871 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation

13 0.1772912 5 emnlp-2011-A Fast Re-scoring Strategy to Capture Long-Distance Dependencies

14 0.17666221 96 emnlp-2011-Multilayer Sequence Labeling

15 0.16572921 7 emnlp-2011-A Joint Model for Extended Semantic Role Labeling

16 0.16402487 113 emnlp-2011-Relation Acquisition using Word Classes and Partial Patterns

17 0.16260421 64 emnlp-2011-Harnessing different knowledge sources to measure semantic relatedness under a uniform model

18 0.16051328 111 emnlp-2011-Reducing Grounded Learning Tasks To Grammatical Inference

19 0.1599936 132 emnlp-2011-Syntax-Based Grammaticality Improvement using CCG and Guided Search

20 0.15793489 32 emnlp-2011-Computing Logical Form on Regulatory Texts


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(23, 0.101), (36, 0.019), (37, 0.021), (45, 0.075), (47, 0.017), (54, 0.017), (57, 0.031), (62, 0.039), (64, 0.019), (66, 0.025), (69, 0.387), (79, 0.033), (82, 0.017), (87, 0.012), (90, 0.02), (96, 0.044), (98, 0.025)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.83820879 51 emnlp-2011-Exact Decoding of Phrase-Based Translation Models through Lagrangian Relaxation

Author: Yin-Wen Chang ; Michael Collins

Abstract: This paper describes an algorithm for exact decoding of phrase-based translation models, based on Lagrangian relaxation. The method recovers exact solutions, with certificates of optimality, on over 99% of test examples. The method is much more efficient than approaches based on linear programming (LP) or integer linear programming (ILP) solvers: these methods are not feasible for anything other than short sentences. We compare our method to MOSES (Koehn et al., 2007), and give precise estimates of the number and magnitude of search errors that MOSES makes.

same-paper 2 0.82616627 109 emnlp-2011-Random Walk Inference and Learning in A Large Scale Knowledge Base

Author: Ni Lao ; Tom Mitchell ; William W. Cohen

Abstract: We consider the problem of performing learning and inference in a large scale knowledge base containing imperfect knowledge with incomplete coverage. We show that a soft inference procedure based on a combination of constrained, weighted, random walks through the knowledge base graph can be used to reliably infer new beliefs for the knowledge base. More specifically, we show that the system can learn to infer different target relations by tuning the weights associated with random walks that follow different paths through the graph, using a version of the Path Ranking Algorithm (Lao and Cohen, 2010b). We apply this approach to a knowledge base of approximately 500,000 beliefs extracted imperfectly from the web by NELL, a never-ending language learner (Carlson et al., 2010). This new system improves significantly over NELL’s earlier Horn-clause learning and inference method: it obtains nearly double the precision at rank 100, and the new learning method is also applicable to many more inference tasks.

3 0.75038373 73 emnlp-2011-Improving Bilingual Projections via Sparse Covariance Matrices

Author: Jagadeesh Jagarlamudi ; Raghavendra Udupa ; Hal Daume III ; Abhijit Bhole

Abstract: Mapping documents into an interlingual representation can help bridge the language barrier of cross-lingual corpora. Many existing approaches are based on word co-occurrences extracted from aligned training data, represented as a covariance matrix. In theory, such a covariance matrix should represent semantic equivalence, and should be highly sparse. Unfortunately, the presence of noise leads to dense covariance matrices which in turn leads to suboptimal document representations. In this paper, we explore techniques to recover the desired sparsity in covariance matrices in two ways. First, we explore word association measures and bilingual dictionaries to weigh the word pairs. Later, we explore different selection strategies to remove the noisy pairs based on the association scores. Our experimental results on the task of aligning comparable documents shows the efficacy of sparse covariance matrices on two data sets from two different language pairs.

4 0.74643356 31 emnlp-2011-Computation of Infix Probabilities for Probabilistic Context-Free Grammars

Author: Mark-Jan Nederhof ; Giorgio Satta

Abstract: The notion of infix probability has been introduced in the literature as a generalization of the notion of prefix (or initial substring) probability, motivated by applications in speech recognition and word error correction. For the case where a probabilistic context-free grammar is used as language model, methods for the computation of infix probabilities have been presented in the literature, based on various simplifying assumptions. Here we present a solution that applies to the problem in its full generality.

5 0.5515765 66 emnlp-2011-Hierarchical Phrase-based Translation Representations

Author: Gonzalo Iglesias ; Cyril Allauzen ; William Byrne ; Adria de Gispert ; Michael Riley

Abstract: This paper compares several translation representations for a synchronous context-free grammar parse including CFGs/hypergraphs, finite-state automata (FSA), and pushdown automata (PDA). The representation choice is shown to determine the form and complexity of target LM intersection and shortest-path algorithms that follow. Intersection, shortest path, FSA expansion and RTN replacement algorithms are presented for PDAs. Chinese-toEnglish translation experiments using HiFST and HiPDT, FSA and PDA-based decoders, are presented using admissible (or exact) search, possible for HiFST with compact SCFG rulesets and HiPDT with compact LMs. For large rulesets with large LMs, we introduce a two-pass search strategy which we then analyze in terms of search errors and translation performance.

6 0.44928211 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training

7 0.4411107 65 emnlp-2011-Heuristic Search for Non-Bottom-Up Tree Structure Prediction

8 0.43553558 49 emnlp-2011-Entire Relaxation Path for Maximum Entropy Problems

9 0.43058184 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation

10 0.42276797 128 emnlp-2011-Structured Relation Discovery using Generative Models

11 0.42035505 59 emnlp-2011-Fast and Robust Joint Models for Biomedical Event Extraction

12 0.4100143 92 emnlp-2011-Minimally Supervised Event Causality Identification

13 0.40779564 3 emnlp-2011-A Correction Model for Word Alignments

14 0.40774149 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation

15 0.40695173 134 emnlp-2011-Third-order Variational Reranking on Packed-Shared Dependency Forests

16 0.40661165 61 emnlp-2011-Generating Aspect-oriented Multi-Document Summarization with Event-aspect model

17 0.40621355 70 emnlp-2011-Identifying Relations for Open Information Extraction

18 0.40600038 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation

19 0.39993116 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning

20 0.39983821 116 emnlp-2011-Robust Disambiguation of Named Entities in Text