
Untangling the Cross-Lingual Link Structure of Wikipedia (ACL 2010)


Source: pdf

Authors: Gerard de Melo; Gerhard Weikum

Abstract: Wikipedia articles in different languages are connected by interwiki links that are increasingly being recognized as a valuable source of cross-lingual information. Unfortunately, large numbers of links are imprecise or simply wrong. In this paper, techniques to detect such problems are identified. We formalize their removal as an optimization task based on graph repair operations. We then present an algorithm with provable properties that uses linear programming and a region growing technique to tackle this challenge. This allows us to transform Wikipedia into a much more consistent multilingual register of the world’s entities and concepts.
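
As a rough illustration of what detecting problematic links could look like, the sketch below groups articles that are transitively connected by interwiki links and flags any group containing two different articles in the same language. This consistency criterion is an assumption made for illustration; the abstract does not spell out the detection techniques. The function name `find_conflicting_components` and the toy link data are hypothetical. The sketch covers detection only and does not attempt the repair step based on linear programming and region growing.

```python
from collections import defaultdict


def find_conflicting_components(links):
    """Return, for each connected group of articles, the languages that
    appear with more than one article (assumed here to signal a bad link).

    `links` is an iterable of pairs of (language, title) tuples.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Treat interwiki links as undirected edges.
    for source, target in links:
        union(source, target)

    # Collect the members of each connected component.
    components = defaultdict(set)
    for node in list(parent):
        components[find(node)].add(node)

    conflicts = []
    for members in components.values():
        by_language = defaultdict(set)
        for language, title in members:
            by_language[language].add(title)
        clashes = {
            language: sorted(titles)
            for language, titles in by_language.items()
            if len(titles) > 1
        }
        if clashes:
            conflicts.append(clashes)
    return conflicts


# Hypothetical toy data: the French article links to a different English
# article than the one the German article is linked from.
links = [
    (("en", "Television"), ("de", "Fernsehen")),
    (("de", "Fernsehen"), ("fr", "Télévision")),
    (("fr", "Télévision"), ("en", "Television set")),
    (("en", "Berlin"), ("de", "Berlin")),
]

print(find_conflicting_components(links))
```

In this toy data the two English articles end up in the same group, so that group is reported, while the Berlin pair is left alone. Deciding which link to remove in order to separate them is the optimization problem the abstract refers to.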

