acl acl2011 acl2011-342 knowledge-graph by maker-knowledge-mining

342 acl-2011-full-for-print


Source: pdf

Author: Kuzman Ganchev

Abstract: unknown-abstract

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Rich Prior Knowledge in Learning for Natural Language Processing Bibliography For a more up-to-date bibliography as well as additional information about these methods, point your browser to: http://sideinfo . [sent-1, score-0.194]

2 com/ 1 Constraint-Driven Learning Constraint driven learning (CoDL) was first introduced in Chang et al. [sent-3, score-0.038]

3 A further paper on the topic is in submission [Chang et al. [sent-6, score-0.038]

4 2 Generalized Expectation Generalized Expectation (GE) constraints were first introduced by Mann and McCallum [2007] 1 and were used to incorporate prior knowledge about the label distribution into semi-supervised classification. [sent-8, score-0.117]

5 GE constraints have also been used to leverage “labeled features” in document classification [Druck et al. [sent-9, score-0.072]

6 , 2009b, Bellare and McCallum, 2009] , and to incorporate linguistic prior knowledge into dependency grammar induction [Druck et al. [sent-11, score-0.254]

7 [2008] , and has been applied to dependency grammar induction [Ganchev et al. [sent-16, score-0.209]

8 , 2010] , part of speech induction [Graça et al. [sent-19, score-0.086]

9 , 2009b] , and cross-lingual semantic alignment [Platt et al. [sent-24, score-0.035]

10 The framework was independently discovered by Bellare et al. [sent-26, score-0.121]

11 [2009] as an approximation to GE constraints, under the name Alternating Projections, and used under that name also by Singh et al. [sent-27, score-0.033]

12 The framework was also independently discovered by Liang et al. [sent-29, score-0.121]

13 [2009] as an approximation to (Footnote 1: In Mann and McCallum [2007] the method was called Expectation Regularization.) [sent-30, score-0.033]

14 a Bayesian model motivated by modeling prior information as measurements, and applied to information extraction. [sent-31, score-0.045]

15 [2009] introduce a distribution matching framework very closely related to GE constraints, with the idea that the model should predict the same feature expectations on labeled and unlabeled data for a set of features, formalized as a kernel. [sent-33, score-0.121]

16 [2010] introduce a framework for semi-supervised learning based on constraints, and trained with an iterative update algorithm very similar to CoDL, but introducing only confident constraints as the algorithm progresses. [sent-35, score-0.149]

17 Gupta and Sarawagi [2011] introduce a framework for agreement that is closely related to the PR-based work in Ganchev et al. [sent-36, score-0.084]

18 Generalized expectation criteria for bootstrapping extractors using record-text alignment. [sent-46, score-0.302]

19 Semi-supervised learning of dependency parsers using generalized expectation criteria. [sent-86, score-0.436]

20 Generalized expectation criteria for semisupervised learning of conditional random fields. [sent-176, score-0.307]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('ganchev', 0.422), ('druck', 0.378), ('expectation', 0.23), ('mann', 0.207), ('gra', 0.191), ('gillenwater', 0.184), ('url', 0.172), ('graa', 0.165), ('joo', 0.165), ('bellare', 0.16), ('graca', 0.134), ('kuzman', 0.13), ('posterior', 0.121), ('mccallum', 0.117), ('chang', 0.116), ('generalized', 0.115), ('bibliography', 0.11), ('codl', 0.11), ('platt', 0.106), ('ge', 0.105), ('ratinov', 0.097), ('quadrianto', 0.09), ('induction', 0.086), ('jennifer', 0.082), ('uai', 0.08), ('gupta', 0.08), ('ben', 0.079), ('singh', 0.076), ('wsdm', 0.074), ('measurements', 0.074), ('naseem', 0.074), ('constraints', 0.072), ('icml', 0.07), ('grammar', 0.07), ('alternating', 0.069), ('projections', 0.065), ('october', 0.065), ('carlson', 0.065), ('gregory', 0.062), ('constrained', 0.056), ('sparsity', 0.056), ('andrew', 0.055), ('dependency', 0.053), ('nips', 0.051), ('roweis', 0.049), ('novi', 0.049), ('translingual', 0.049), ('constraintdriven', 0.049), ('postcat', 0.049), ('ma', 0.048), ('pereira', 0.048), ('discovered', 0.048), ('http', 0.047), ('prior', 0.045), ('sarawagi', 0.045), ('sunita', 0.045), ('csail', 0.045), ('petterson', 0.045), ('aclweb', 0.045), ('closely', 0.045), ('cambridge', 0.043), ('regularization', 0.043), ('pr', 0.043), ('rizzolo', 0.042), ('dustin', 0.042), ('hillard', 0.042), ('february', 0.042), ('fernando', 0.041), ('kedar', 0.04), ('joao', 0.04), ('schuurmans', 0.04), ('burr', 0.04), ('settles', 0.04), ('estevam', 0.04), ('framework', 0.039), ('criteria', 0.039), ('learning', 0.038), ('submission', 0.038), ('williams', 0.038), ('bitext', 0.038), ('aaai', 0.037), ('formalized', 0.037), ('browser', 0.037), ('culotta', 0.037), ('betteridge', 0.037), ('hruschka', 0.037), ('harr', 0.037), ('tahira', 0.037), ('neural', 0.036), ('bengio', 0.035), ('guiding', 0.035), ('alignment', 0.035), ('cm', 0.034), ('lev', 0.034), ('independently', 0.034), ('liang', 0.034), ('rahul', 0.033), ('sameer', 0.033), ('extractors', 0.033), ('approximation', 0.033)]
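
The simValue column in the similar-papers list is not defined on this page. The following is a minimal sketch, assuming it is a cosine similarity between sparse tf-idf vectors such as the one above. That assumption, the function name `cosine_similarity`, and the second vector are illustrative only and are not taken from the source.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_a * norm_b)


# A few weights copied from the tf-idf list for this paper.
this_paper = {
    "ganchev": 0.422,
    "druck": 0.378,
    "expectation": 0.23,
    "posterior": 0.121,
    "constraints": 0.072,
}

# Hypothetical vector for another paper (made-up weights, for illustration).
other_paper = {"constraints": 0.3, "posterior": 0.2, "parsing": 0.5}

self_similarity = cosine_similarity(this_paper, this_paper)
cross_similarity = cosine_similarity(this_paper, other_paper)
print(self_similarity, cross_similarity)
```

Under this assumption a paper compared with itself scores approximately 1, which would be consistent with the same-paper row scoring just under 1 in the tfidf list.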

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999976 342 acl-2011-full-for-print

Author: Kuzman Ganchev

Abstract: unknown-abstract

2 0.14117664 103 acl-2011-Domain Adaptation by Constraining Inter-Domain Variability of Latent Feature Representation

Author: Ivan Titov

Abstract: We consider a semi-supervised setting for domain adaptation where only unlabeled data is available for the target domain. One way to tackle this problem is to train a generative model with latent variables on the mixture of data from the source and target domains. Such a model would cluster features in both domains and ensure that at least some of the latent variables are predictive of the label on the source domain. The danger is that these predictive clusters will consist of features specific to the source domain only and, consequently, a classifier relying on such clusters would perform badly on the target domain. We introduce a constraint enforcing that marginal distributions of each cluster (i.e., each latent variable) do not vary significantly across domains. We show that this constraint is effective on the sentiment classification task (Pang et al., 2002), resulting in scores similar to the ones obtained by the structural correspondence methods (Blitzer et al., 2007) without the need to engineer auxiliary tasks.

3 0.12790252 92 acl-2011-Data point selection for cross-language adaptation of dependency parsers

Author: Anders Søgaard

Abstract: We consider a very simple, yet effective, approach to cross language adaptation of dependency parsers. We first remove lexical items from the treebanks and map part-of-speech tags into a common tagset. We then train a language model on tag sequences in otherwise unlabeled target data and rank labeled source data by perplexity per word of tag sequences from less similar to most similar to the target. We then train our target language parser on the most similar data points in the source labeled data. The strategy achieves much better results than a non-adapted baseline and state-of-the-art unsupervised dependency parsing, and results are comparable to more complex projection-based cross language adaptation algorithms.

4 0.11266032 170 acl-2011-In-domain Relation Discovery with Meta-constraints via Posterior Regularization

Author: Harr Chen ; Edward Benson ; Tahira Naseem ; Regina Barzilay

Abstract: We present a novel approach to discovering relations and their instantiations from a collection of documents in a single domain. Our approach learns relation types by exploiting meta-constraints that characterize the general qualities of a good relation in any domain. These constraints state that instances of a single relation should exhibit regularities at multiple levels of linguistic structure, including lexicography, syntax, and document-level context. We capture these regularities via the structure of our probabilistic model as well as a set of declaratively-specified constraints enforced during posterior inference. Across two domains our approach successfully recovers hidden relation structure, comparable to or outperforming previous state-of-the-art approaches. Furthermore, we find that a small set of constraints is applicable across the domains, and that using domain-specific constraints can further improve performance.

5 0.087486617 243 acl-2011-Partial Parsing from Bitext Projections

Author: Prashanth Mannem ; Aswarth Dara

Abstract: Recent work has shown how a parallel corpus can be leveraged to build a syntactic parser for a target language by projecting an automatic source parse onto the target sentence using word alignments. The projected target dependency parses are not always fully connected, which limits their use for training traditional dependency parsers. In this paper, we present a greedy non-directional parsing algorithm which doesn’t need a fully connected parse and can learn from partial parses by utilizing available structural and syntactic information in them. Our parser achieved statistically significant improvements over a baseline system that trains on only fully connected parses for Bulgarian, Spanish and Hindi. It also gave a significant improvement over previously reported results for Bulgarian and set a benchmark for Hindi.

6 0.076442845 15 acl-2011-A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction

7 0.067425348 230 acl-2011-Neutralizing Linguistically Problematic Annotations in Unsupervised Dependency Parsing Evaluation

8 0.063878253 221 acl-2011-Model-Based Aligner Combination Using Dual Decomposition

9 0.06262701 323 acl-2011-Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections

10 0.06097322 117 acl-2011-Entity Set Expansion using Topic information

11 0.056386251 196 acl-2011-Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models

12 0.055077799 265 acl-2011-Reordering Modeling using Weighted Alignment Matrices

13 0.050997108 246 acl-2011-Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition

14 0.049552642 189 acl-2011-K-means Clustering with Feature Hashing

15 0.049214032 139 acl-2011-From Bilingual Dictionaries to Interlingual Document Representations

16 0.04500578 178 acl-2011-Interactive Topic Modeling

17 0.043836817 190 acl-2011-Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations

18 0.043641865 57 acl-2011-Bayesian Word Alignment for Statistical Machine Translation

19 0.041348148 79 acl-2011-Confidence Driven Unsupervised Semantic Parsing

20 0.040280156 325 acl-2011-Unsupervised Word Alignment with Arbitrary Features


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.109), (1, -0.0), (2, -0.029), (3, -0.023), (4, 0.015), (5, -0.032), (6, 0.011), (7, 0.055), (8, -0.033), (9, 0.048), (10, 0.086), (11, 0.04), (12, 0.053), (13, 0.026), (14, 0.009), (15, 0.04), (16, -0.019), (17, -0.041), (18, -0.044), (19, -0.041), (20, -0.045), (21, -0.046), (22, -0.005), (23, 0.019), (24, -0.01), (25, 0.019), (26, -0.031), (27, -0.002), (28, 0.065), (29, -0.006), (30, 0.014), (31, 0.111), (32, 0.056), (33, 0.041), (34, 0.033), (35, -0.012), (36, 0.016), (37, 0.072), (38, -0.056), (39, -0.028), (40, 0.018), (41, -0.027), (42, 0.003), (43, -0.047), (44, -0.076), (45, -0.119), (46, -0.056), (47, 0.053), (48, 0.013), (49, -0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92027688 342 acl-2011-full-for-print

Author: Kuzman Ganchev

Abstract: unknown-abstract

2 0.63139158 103 acl-2011-Domain Adaptation by Constraining Inter-Domain Variability of Latent Feature Representation

Author: Ivan Titov

Abstract: We consider a semi-supervised setting for domain adaptation where only unlabeled data is available for the target domain. One way to tackle this problem is to train a generative model with latent variables on the mixture of data from the source and target domains. Such a model would cluster features in both domains and ensure that at least some of the latent variables are predictive of the label on the source domain. The danger is that these predictive clusters will consist of features specific to the source domain only and, consequently, a classifier relying on such clusters would perform badly on the target domain. We introduce a constraint enforcing that marginal distributions of each cluster (i.e., each latent variable) do not vary significantly across domains. We show that this constraint is effective on the sentiment classification task (Pang et al., 2002), resulting in scores similar to the ones obtained by the structural correspondence methods (Blitzer et al., 2007) without the need to engineer auxiliary tasks.

3 0.56863618 92 acl-2011-Data point selection for cross-language adaptation of dependency parsers

Author: Anders Søgaard

Abstract: We consider a very simple, yet effective, approach to cross language adaptation of dependency parsers. We first remove lexical items from the treebanks and map part-of-speech tags into a common tagset. We then train a language model on tag sequences in otherwise unlabeled target data and rank labeled source data by perplexity per word of tag sequences from less similar to most similar to the target. We then train our target language parser on the most similar data points in the source labeled data. The strategy achieves much better results than a non-adapted baseline and state-of-the-art unsupervised dependency parsing, and results are comparable to more complex projection-based cross language adaptation algorithms.

4 0.5329836 190 acl-2011-Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations

Author: Raphael Hoffmann ; Congle Zhang ; Xiao Ling ; Luke Zettlemoyer ; Daniel S. Weld

Abstract: Information extraction (IE) holds the promise of generating a large-scale knowledge base from the Web’s natural language text. Knowledge-based weak supervision, using structured data to heuristically label a training corpus, works towards this goal by enabling the automated learning of a potentially unbounded number of relation extractors. Recently, researchers have developed multi-instance learning algorithms to combat the noisy training data that can come from heuristic labeling, but their models assume relations are disjoint; for example, they cannot extract the pair Founded(Jobs, Apple) and CEO-of(Jobs, Apple). This paper presents a novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts. We apply our model to learn extractors for NY Times text using weak supervision from Freebase. Experiments show that the approach runs quickly and yields surprising gains in accuracy, at both the aggregate and sentence level.

5 0.52799809 323 acl-2011-Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections

Author: Dipanjan Das ; Slav Petrov

Abstract: We describe a novel approach for inducing unsupervised part-of-speech taggers for languages that have no labeled training data, but have translated text in a resource-rich language. Our method does not assume any knowledge about the target language (in particular no tagging dictionary is assumed), making it applicable to a wide array of resource-poor languages. We use graph-based label propagation for cross-lingual knowledge transfer and use the projected labels as features in an unsupervised model (Berg-Kirkpatrick et al., 2010). Across eight European languages, our approach results in an average absolute improvement of 10.4% over a state-of-the-art baseline, and 16.7% over vanilla hidden Markov models induced with the Expectation Maximization algorithm.

6 0.50790012 243 acl-2011-Partial Parsing from Bitext Projections

7 0.49923879 295 acl-2011-Temporal Restricted Boltzmann Machines for Dependency Parsing

8 0.46606344 170 acl-2011-In-domain Relation Discovery with Meta-constraints via Posterior Regularization

9 0.46171117 303 acl-2011-Tier-based Strictly Local Constraints for Phonology

10 0.46094102 150 acl-2011-Hierarchical Text Classification with Latent Concepts

11 0.44373918 278 acl-2011-Semi-supervised condensed nearest neighbor for part-of-speech tagging

12 0.44230512 221 acl-2011-Model-Based Aligner Combination Using Dual Decomposition

13 0.4354299 121 acl-2011-Event Discovery in Social Media Feeds

14 0.43400434 335 acl-2011-Why Initialization Matters for IBM Model 1: Multiple Optima and Non-Strict Convexity

15 0.41702771 179 acl-2011-Is Machine Translation Ripe for Cross-Lingual Sentiment Classification?

16 0.4092907 15 acl-2011-A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction

17 0.40530184 239 acl-2011-P11-5002 k2opt.pdf

18 0.40310854 234 acl-2011-Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems

19 0.4009726 230 acl-2011-Neutralizing Linguistically Problematic Annotations in Unsupervised Dependency Parsing Evaluation

20 0.4009212 123 acl-2011-Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(17, 0.023), (26, 0.01), (31, 0.023), (37, 0.09), (39, 0.038), (41, 0.094), (53, 0.014), (55, 0.035), (59, 0.029), (72, 0.031), (77, 0.392), (91, 0.036), (96, 0.092)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.75323004 342 acl-2011-full-for-print

Author: Kuzman Ganchev

Abstract: unknown-abstract

2 0.64639616 223 acl-2011-Modeling Wisdom of Crowds Using Latent Mixture of Discriminative Experts

Author: Derya Ozkan ; Louis-Philippe Morency

Abstract: In many computational linguistic scenarios, training labels are subjective, making it necessary to acquire the opinions of multiple annotators/experts, which is referred to as “wisdom of crowds”. In this paper, we propose a new approach for modeling wisdom of crowds based on the Latent Mixture of Discriminative Experts (LMDE) model that can automatically learn the prototypical patterns and hidden dynamic among different experts. Experiments show improvement over state-of-the-art approaches on the task of listener backchannel prediction in dyadic conversations.

3 0.41357511 92 acl-2011-Data point selection for cross-language adaptation of dependency parsers

Author: Anders Søgaard

Abstract: We consider a very simple, yet effective, approach to cross language adaptation of dependency parsers. We first remove lexical items from the treebanks and map part-of-speech tags into a common tagset. We then train a language model on tag sequences in otherwise unlabeled target data and rank labeled source data by perplexity per word of tag sequences from less similar to most similar to the target. We then train our target language parser on the most similar data points in the source labeled data. The strategy achieves much better results than a non-adapted baseline and state-of-the-art unsupervised dependency parsing, and results are comparable to more complex projection-based cross language adaptation algorithms.

4 0.40152603 190 acl-2011-Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations

Author: Raphael Hoffmann ; Congle Zhang ; Xiao Ling ; Luke Zettlemoyer ; Daniel S. Weld

Abstract: Information extraction (IE) holds the promise of generating a large-scale knowledge base from the Web’s natural language text. Knowledge-based weak supervision, using structured data to heuristically label a training corpus, works towards this goal by enabling the automated learning of a potentially unbounded number of relation extractors. Recently, researchers have developed multi-instance learning algorithms to combat the noisy training data that can come from heuristic labeling, but their models assume relations are disjoint; for example, they cannot extract the pair Founded(Jobs, Apple) and CEO-of(Jobs, Apple). This paper presents a novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts. We apply our model to learn extractors for NY Times text using weak supervision from Freebase. Experiments show that the approach runs quickly and yields surprising gains in accuracy, at both the aggregate and sentence level.

5 0.39022803 65 acl-2011-Can Document Selection Help Semi-supervised Learning? A Case Study On Event Extraction

Author: Shasha Liao ; Ralph Grishman

Abstract: Annotating training data for event extraction is tedious and labor-intensive. Most current event extraction tasks rely on hundreds of annotated documents, but this is often not enough. In this paper, we present a novel self-training strategy, which uses Information Retrieval (IR) to collect a cluster of related documents as the resource for bootstrapping. Also, based on the particular characteristics of this corpus, global inference is applied to provide more confident and informative data selection. We compare this approach to self-training on a normal newswire corpus and show that IR can provide a better corpus for bootstrapping and that global inference can further improve instance selection. We obtain gains of 1.7% in trigger labeling and 2.3% in role labeling through IR and an additional 1.1% in trigger labeling and 1.3% in role labeling by applying global inference.

6 0.3893913 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction

7 0.38673103 185 acl-2011-Joint Identification and Segmentation of Domain-Specific Dialogue Acts for Conversational Dialogue Systems

8 0.38450533 196 acl-2011-Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models

9 0.38431662 58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing

10 0.38430661 128 acl-2011-Exploring Entity Relations for Named Entity Disambiguation

11 0.38206828 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

12 0.37887153 119 acl-2011-Evaluating the Impact of Coder Errors on Active Learning

13 0.37818465 316 acl-2011-Unary Constraints for Efficient Context-Free Parsing

14 0.37767661 172 acl-2011-Insertion, Deletion, or Substitution? Normalizing Text Messages without Pre-categorization nor Supervision

15 0.37686956 209 acl-2011-Lexically-Triggered Hidden Markov Models for Clinical Document Coding

16 0.37623852 246 acl-2011-Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition

17 0.37619391 111 acl-2011-Effects of Noun Phrase Bracketing in Dependency Parsing and Machine Translation

18 0.37573701 40 acl-2011-An Error Analysis of Relation Extraction in Social Media Documents

19 0.37549174 295 acl-2011-Temporal Restricted Boltzmann Machines for Dependency Parsing

20 0.37534988 73 acl-2011-Collective Classification of Congressional Floor-Debate Transcripts