acl acl2013 acl2013-382 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: David Burkett ; Dan Klein
Abstract: unknown-abstract
Reference: text
Variational Inference for Structured NLP Models
David Burkett and Dan Klein
Computer Science Division
University of California, Berkeley
{dburkett, klein}@cs.berkeley.edu

Description

Historically, key breakthroughs in structured NLP models, such as chain CRFs or PCFGs, have relied on imposing careful constraints on the locality of features in order to permit efficient dynamic programming for computing expectations or finding the highest-scoring structures.
For richer models, however, exact inference quickly becomes impractical or even impossible, and one must instead seek approximations. In the NLP community, one increasingly popular approach is the use of variational methods for computing approximate distributions.
The goal of the tutorial is to provide an introduction to variational methods for approximate inference, particularly mean field approximation and belief propagation.
Though the full derivations can be somewhat tedious, the resulting procedures are quite straightforward, and typically consist of an iterative process of individually updating specific components of the model, conditioned on the rest.
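To make this kind of iterative update concrete, here is a minimal naive mean field sketch for a toy two-variable binary model p(x1, x2) ∝ exp(θ1·x1 + θ2·x2 + θ12·x1·x2). Each coordinate step sets one factor of the approximation q(x1)·q(x2) optimally while the other is held fixed. The model, parameters, and function names are illustrative assumptions, not code from the tutorial:

```python
import math

def mean_field_binary(theta1, theta2, theta12, iters=50):
    """Naive mean field for p(x1, x2) ∝ exp(t1*x1 + t2*x2 + t12*x1*x2),
    x1, x2 in {0, 1}. q factorizes as q1(x1)*q2(x2); each update sets one
    factor from the expected score under the other, held fixed."""
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))
    mu1, mu2 = 0.5, 0.5  # E_q[x1], E_q[x2], initialized to uniform
    for _ in range(iters):
        mu1 = sigmoid(theta1 + theta12 * mu2)  # update q1 given current q2
        mu2 = sigmoid(theta2 + theta12 * mu1)  # update q2 given current q1
    return mu1, mu2
```

When θ12 = 0 the variables are independent and the fixed point is exact; with coupling, the iteration converges to a local optimum of the mean field objective, which is the typical behavior of these coordinate-update procedures.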
Although we will provide some theoretical background, the main goal of the tutorial is to provide a concrete procedural guide to using these approximate inference techniques, illustrated with detailed walkthroughs of examples from recent NLP literature. Once both variational inference procedures have been described in detail, we’ll provide a summary comparison of the two, along with some intuition about which approach is appropriate when. We’ll also provide a guide to further exploration of the topic, briefly discussing other variational techniques, such as expectation propagation and convex relaxations, but concentrating mainly on providing pointers to additional resources for those who wish to learn more.
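The second of the two procedures, belief propagation, is organized around messages and beliefs. As a rough illustration of what those updates look like, here is a minimal sum-product sketch for pairwise binary models; the graph encoding, potential tables, and names are all illustrative assumptions, not code from the tutorial:

```python
import math

def sum_product(unary, pair, iters=30):
    """Sum-product belief propagation on a pairwise binary model.

    unary: {node: [phi(0), phi(1)]}
    pair:  {(i, j): 2x2 table psi[x_i][x_j]}, one entry per edge.
    Messages are updated in parallel each round; on tree-structured
    graphs the resulting beliefs are exact marginals, while on graphs
    with cycles the same updates give the loopy BP approximation."""
    nodes = list(unary)
    nbrs = {i: [] for i in nodes}
    for (i, j) in pair:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def psi(i, j, xi, xj):
        # Look up the pairwise potential regardless of edge orientation.
        return pair[(i, j)][xi][xj] if (i, j) in pair else pair[(j, i)][xj][xi]

    # Initialize every directed message to uniform.
    msg = {(i, j): [1.0, 1.0] for i in nodes for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in msg:
            out = []
            for xj in (0, 1):
                # Sum over x_i of unary * pairwise * incoming messages (except from j).
                total = 0.0
                for xi in (0, 1):
                    prod = unary[i][xi] * psi(i, j, xi, xj)
                    for k in nbrs[i]:
                        if k != j:
                            prod *= msg[(k, i)][xi]
                    total += prod
                out.append(total)
            z = sum(out)
            new[(i, j)] = [v / z for v in out]  # normalize for numerical stability
        msg = new

    # Beliefs: unary potential times the product of all incoming messages.
    beliefs = {}
    for i in nodes:
        b = [unary[i][xi] * math.prod(msg[(k, i)][xi] for k in nbrs[i])
             for xi in (0, 1)]
        z = sum(b)
        beliefs[i] = [v / z for v in b]
    return beliefs
```

On a two-node chain the beliefs match the exact marginals computed by enumeration, which is a useful sanity check before running the same updates on a loopy graph.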
Outline

1. Structured Models and Factor Graphs
• Factor graph notation
• Example structured NLP models
• Inference

2. Mean Field
• Warmup (iterated conditional modes)
• Mean field procedure
• Derivation of mean field update
• Example

3. Structured Mean Field
• Structured approximation
• Computing structured updates
• Example: Joint parsing and alignment

4. Belief Propagation
• Intro
• Messages and beliefs
• Loopy BP

5. Structured Belief Propagation
• Warmup (efficient products for messages)
• Example: Word alignment
• Example: Dependency parsing

6. Wrap-Up
• Mean field vs BP
• Other approximation techniques

Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pages 9–10, Sofia, Bulgaria, August 4–9 2013. ©2013 Association for Computational Linguistics

Presenter Bios

David Burkett is a postdoctoral researcher in the Computer Science Division at the University of California, Berkeley.
His interests are diverse: he has worked on parsing, phrase alignment, language evolution, coreference resolution, and even video game AI.
He has worked as an instructional assistant for multiple AI courses at Berkeley and won multiple Outstanding Graduate Student Instructor awards.
Dan Klein is a professor in the Computer Science Division at the University of California, Berkeley. His research spans many areas of statistical natural language processing, including grammar induction, parsing, machine translation, information extraction, document summarization, historical linguistics, and speech recognition.
His academic awards include a Sloan Fellowship, a Microsoft Faculty Fellowship, an NSF CAREER Award, the ACM Grace Murray Hopper Award, Best Paper Awards at ACL, EMNLP and NAACL, and the UC Berkeley Distinguished Teaching Award.
simIndex simValue paperId paperTitle
same-paper 1 0.99999976 382 acl-2013-Variational Inference for Structured NLP Models
Author: David Burkett ; Dan Klein
Abstract: unknown-abstract
2 0.09828683 106 acl-2013-Decentralized Entity-Level Modeling for Coreference Resolution
Author: Greg Durrett ; David Hall ; Dan Klein
Abstract: Efficiently incorporating entity-level information is a challenge for coreference resolution systems due to the difficulty of exact inference over partitions. We describe an end-to-end discriminative probabilistic model for coreference that, along with standard pairwise features, enforces structural agreement constraints between specified properties of coreferent mentions. This model can be represented as a factor graph for each document that admits efficient inference via belief propagation. We show that our method can use entity-level information to outperform a basic pairwise system.
3 0.093920171 313 acl-2013-Semantic Parsing with Combinatory Categorial Grammars
Author: Yoav Artzi ; Nicholas FitzGerald ; Luke Zettlemoyer
Abstract: unknown-abstract
4 0.093292363 349 acl-2013-The mathematics of language learning
Author: Andras Kornai ; Gerald Penn ; James Rogers ; Anssi Yli-Jyra
Abstract: unknown-abstract
5 0.080639854 210 acl-2013-Joint Word Alignment and Bilingual Named Entity Recognition Using Dual Decomposition
Author: Mengqiu Wang ; Wanxiang Che ; Christopher D. Manning
Abstract: Translated bi-texts contain complementary language cues, and previous work on Named Entity Recognition (NER) has demonstrated improvements in performance over monolingual taggers by promoting agreement of tagging decisions between the two languages. However, most previous approaches to bilingual tagging assume word alignments are given as fixed input, which can cause cascading errors. We observe that NER label information can be used to correct alignment mistakes, and present a graphical model that performs bilingual NER tagging jointly with word alignment, by combining two monolingual tagging models with two unidirectional alignment models. We introduce additional cross-lingual edge factors that encourage agreements between tagging and alignment decisions. We design a dual decomposition inference algorithm to perform joint decoding over the combined alignment and NER output space. Experiments on the OntoNotes dataset demonstrate that our method yields significant improvements in both NER and word alignment over state-of-the-art monolingual baselines.
6 0.078299314 237 acl-2013-Margin-based Decomposed Amortized Inference
7 0.075796723 251 acl-2013-Mr. MIRA: Open-Source Large-Margin Structured Learning on MapReduce
8 0.070151664 191 acl-2013-Improved Bayesian Logistic Supervised Topic Models with Data Augmentation
9 0.069315307 190 acl-2013-Implicatures and Nested Beliefs in Approximate Decentralized-POMDPs
10 0.067313597 173 acl-2013-Graph-based Semi-Supervised Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging
11 0.064507537 348 acl-2013-The effect of non-tightness on Bayesian estimation of PCFGs
12 0.063745387 269 acl-2013-PLIS: a Probabilistic Lexical Inference System
13 0.062233817 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search
14 0.06144033 143 acl-2013-Exact Maximum Inference for the Fertility Hidden Markov Model
15 0.056381736 44 acl-2013-An Empirical Examination of Challenges in Chinese Parsing
16 0.056323532 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching
17 0.054125108 108 acl-2013-Decipherment
18 0.053923897 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT
19 0.053854462 9 acl-2013-A Lightweight and High Performance Monolingual Word Aligner
20 0.052419707 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models
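Entry 5 above decodes jointly via dual decomposition: each submodel is decoded independently, and Lagrange multipliers are adjusted until the two solutions agree. A toy sketch of that recipe, with brute-force decoders, made-up scorers, and a simple decaying step size standing in for the paper's actual components:

```python
import itertools

def dual_decompose(score_a, score_b, n, iters=100, rate=0.5):
    """Toy subgradient dual decomposition over binary vectors of length n.

    score_a and score_b map a tuple of n bits to a float. Each round
    decodes the two subproblems separately, with multipliers u rewarding
    or penalizing disagreement on each coordinate; if the two argmaxes
    ever agree, that agreement certifies a jointly optimal solution.
    Decoding is brute force here; a real system would plug in each
    subproblem's own efficient decoder (e.g. Viterbi)."""
    space = list(itertools.product((0, 1), repeat=n))
    u = [0.0] * n
    ya = space[0]
    for t in range(1, iters + 1):
        ya = max(space, key=lambda y: score_a(y) + sum(ui * yi for ui, yi in zip(u, y)))
        yb = max(space, key=lambda y: score_b(y) - sum(ui * yi for ui, yi in zip(u, y)))
        if ya == yb:
            return ya  # certificate of joint optimality
        step = rate / t  # standard decaying step size
        u = [ui - step * (ai - bi) for ui, ai, bi in zip(u, ya, yb)]
    return ya  # no certificate; fall back to subproblem A's last solution
```

With two scorers that individually prefer conflicting solutions but jointly prefer agreement, a few multiplier updates are enough to pull the two decoders onto the same output.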
simIndex simValue paperId paperTitle
same-paper 1 0.96207976 382 acl-2013-Variational Inference for Structured NLP Models
Author: David Burkett ; Dan Klein
Abstract: unknown-abstract
2 0.68479532 143 acl-2013-Exact Maximum Inference for the Fertility Hidden Markov Model
Author: Chris Quirk
Abstract: The notion of fertility in word alignment (the number of words emitted by a single state) is useful but difficult to model. Initial attempts at modeling fertility used heuristic search methods. Recent approaches instead use more principled approximate inference techniques such as Gibbs sampling for parameter estimation. Yet in practice we also need the single best alignment, which is difficult to find using Gibbs. Building on recent advances in dual decomposition, this paper introduces an exact algorithm for finding the single best alignment with a fertility HMM. Finding the best alignment appears important, as this model leads to a substantial improvement in alignment quality.
3 0.64471859 260 acl-2013-Nonconvex Global Optimization for Latent-Variable Models
Author: Matthew R. Gormley ; Jason Eisner
Abstract: Many models in NLP involve latent variables, such as unknown parses, tags, or alignments. Finding the optimal model parameters is then usually a difficult nonconvex optimization problem. The usual practice is to settle for local optimization methods such as EM or gradient ascent. We explore how one might instead search for a global optimum in parameter space, using branch-and-bound. Our method would eventually find the global maximum (up to a user-specified ε) if run for long enough, but at any point can return a suboptimal solution together with an upper bound on the global maximum. As an illustrative case, we study a generative model for dependency parsing. We search for the maximum-likelihood model parameters and corpus parse, subject to posterior constraints. We show how to formulate this as a mixed integer quadratic programming problem with nonlinear constraints. We use the Reformulation Linearization Technique to produce convex relaxations during branch-and-bound. Although these techniques do not yet provide a practical solution to our instance of this NP-hard problem, they sometimes find better solutions than Viterbi EM with random restarts, in the same time.
4 0.64024037 237 acl-2013-Margin-based Decomposed Amortized Inference
Author: Gourab Kundu ; Vivek Srikumar ; Dan Roth
Abstract: Given that structured output prediction is typically performed over entire datasets, one natural question is whether it is possible to re-use computation from earlier inference instances to speed up inference for future instances. Amortized inference has been proposed as a way to accomplish this. In this paper, first, we introduce a new amortized inference algorithm called the Margin-based Amortized Inference, which uses the notion of structured margin to identify inference problems for which previous solutions are provably optimal. Second, we introduce decomposed amortized inference, which is designed to address very large inference problems, where earlier amortization methods become less effective. This approach works by decomposing the output structure and applying amortization piece-wise, thus increasing the chance that we can re-use previous solutions for parts of the output structure. These parts are then combined to a global coherent solution using Lagrangian relaxation. In our experiments, using the NLP tasks of semantic role labeling and entityrelation extraction, we demonstrate that with the margin-based algorithm, we need to call the inference engine only for a third of the test examples. Further, we show that the decomposed variant of margin-based amortized inference achieves a greater reduction in the number of inference calls.
5 0.59335655 210 acl-2013-Joint Word Alignment and Bilingual Named Entity Recognition Using Dual Decomposition
Author: Mengqiu Wang ; Wanxiang Che ; Christopher D. Manning
6 0.5811612 334 acl-2013-Supervised Model Learning with Feature Grouping based on a Discrete Constraint
7 0.57531476 362 acl-2013-Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers
8 0.55698234 354 acl-2013-Training Nondeficient Variants of IBM-3 and IBM-4 for Word Alignment
9 0.55559206 106 acl-2013-Decentralized Entity-Level Modeling for Coreference Resolution
10 0.52377528 370 acl-2013-Unsupervised Transcription of Historical Documents
11 0.51689422 190 acl-2013-Implicatures and Nested Beliefs in Approximate Decentralized-POMDPs
12 0.49601176 349 acl-2013-The mathematics of language learning
13 0.49460596 175 acl-2013-Grounded Language Learning from Video Described with Sentences
14 0.47815046 157 acl-2013-Fast and Robust Compressive Summarization with Dual Decomposition and Multi-Task Learning
15 0.47540998 228 acl-2013-Leveraging Domain-Independent Information in Semantic Parsing
16 0.46546209 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning
17 0.46125442 259 acl-2013-Non-Monotonic Sentence Alignment via Semisupervised Learning
18 0.45851338 313 acl-2013-Semantic Parsing with Combinatory Categorial Grammars
19 0.45646343 36 acl-2013-Adapting Discriminative Reranking to Grounded Language Learning
20 0.45471755 269 acl-2013-PLIS: a Probabilistic Lexical Inference System
simIndex simValue paperId paperTitle
same-paper 1 0.85508525 382 acl-2013-Variational Inference for Structured NLP Models
Author: David Burkett ; Dan Klein
Abstract: unknown-abstract
2 0.73598659 266 acl-2013-PAL: A Chatterbot System for Answering Domain-specific Questions
Author: Yuanchao Liu ; Ming Liu ; Xiaolong Wang ; Limin Wang ; Jingjing Li
Abstract: In this paper, we propose PAL, a prototype chatterbot for answering non-obstructive psychological domain-specific questions. This system focuses on providing primary suggestions or helping people relieve pressure by extracting knowledge from online forums, based on which the chatterbot system is constructed. The strategies used by PAL, including semantic-extension-based question matching, solution management with personal information consideration, and XML-based knowledge pattern construction, are described and discussed. We also conduct a primary test for the feasibility of our system.
3 0.59387678 367 acl-2013-Universal Conceptual Cognitive Annotation (UCCA)
Author: Omri Abend ; Ari Rappoport
Abstract: Syntactic structures, by their nature, reflect first and foremost the formal constructions used for expressing meanings. This renders them sensitive to formal variation both within and across languages, and limits their value to semantic applications. We present UCCA, a novel multi-layered framework for semantic representation that aims to accommodate the semantic distinctions expressed through linguistic utterances. We demonstrate UCCA’s portability across domains and languages, and its relative insensitivity to meaning-preserving syntactic variation. We also show that UCCA can be effectively and quickly learned by annotators with no linguistic background, and describe the compilation of a UCCA-annotated corpus.
4 0.58589828 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model
Author: Ulle Endriss ; Raquel Fernandez
Abstract: Crowdsourcing, which offers new ways of cheaply and quickly gathering large amounts of information contributed by volunteers online, has revolutionised the collection of labelled data. Yet, to create annotated linguistic resources from this data, we face the challenge of having to combine the judgements of a potentially large group of annotators. In this paper we investigate how to aggregate individual annotations into a single collective annotation, taking inspiration from the field of social choice theory. We formulate a general formal model for collective annotation and propose several aggregation methods that go beyond the commonly used majority rule. We test some of our methods on data from a crowdsourcing experiment on textual entailment annotation.
5 0.58435845 225 acl-2013-Learning to Order Natural Language Texts
Author: Jiwei Tan ; Xiaojun Wan ; Jianguo Xiao
Abstract: Ordering texts is an important task for many NLP applications. Most previous works on summary sentence ordering rely on the contextual information (e.g. adjacent sentences) of each sentence in the source document. In this paper, we investigate a more challenging task of ordering a set of unordered sentences without any contextual information. We introduce a set of features to characterize the order and coherence of natural language texts, and use the learning to rank technique to determine the order of any two sentences. We also propose to use the genetic algorithm to determine the total order of all sentences. Evaluation results on a news corpus show the effectiveness of our proposed method.
6 0.58366477 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction
7 0.58323777 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution
8 0.57857931 275 acl-2013-Parsing with Compositional Vector Grammars
9 0.57336724 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction
10 0.57089984 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD
11 0.56857204 133 acl-2013-Efficient Implementation of Beam-Search Incremental Parsers
12 0.56687653 85 acl-2013-Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis
13 0.56670463 331 acl-2013-Stop-probability estimates computed on a large corpus improve Unsupervised Dependency Parsing
14 0.56616247 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference
15 0.56604701 249 acl-2013-Models of Semantic Representation with Visual Attributes
16 0.56493711 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
17 0.56485426 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
18 0.56402004 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing
19 0.56394762 172 acl-2013-Graph-based Local Coherence Modeling
20 0.56354231 157 acl-2013-Fast and Robust Compressive Summarization with Dual Decomposition and Multi-Task Learning