emnlp emnlp2010 emnlp2010-93 knowledge-graph by maker-knowledge-mining

93 emnlp-2010-Resolving Event Noun Phrases to Their Verbal Mentions


Source: pdf

Author: Bin Chen ; Jian Su ; Chew Lim Tan

Abstract: unknown-abstract

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, MIT, Massachusetts, USA, 9–11 October 2010. [sent-3031, score-0.035]

2 pages 872–881. ©2010 Association for Computational Linguistics. [sent-3033, score-0.07]

3 [their first major break] [sent-30037, score-0.087]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('flat', 0.381), ('discovered', 0.291), ('break', 0.267), ('strikes', 0.248), ('struck', 0.248), ('president', 0.237), ('mm', 0.207), ('hit', 0.207), ('bush', 0.207), ('positional', 0.193), ('approved', 0.193), ('chain', 0.185), ('ma', 0.156), ('aa', 0.146), ('air', 0.14), ('today', 0.14), ('np', 0.131), ('bm', 0.124), ('nus', 0.124), ('aab', 0.124), ('george', 0.113), ('characteristics', 0.113), ('sg', 0.112), ('object', 0.104), ('demo', 0.097), ('comp', 0.097), ('event', 0.094), ('characteristic', 0.086), ('major', 0.086), ('decision', 0.083), ('anaphora', 0.082), ('id', 0.068), ('ir', 0.06), ('count', 0.06), ('conventional', 0.058), ('resolution', 0.053), ('applicable', 0.048), ('edu', 0.039), ('features', 0.033), ('verb', 0.033), ('candidate', 0.032), ('grammar', 0.032), ('nsastoucira', 0.016), ('pinag', 0.016), ('tlio', 0.016), ('total', 0.016), ('eusis', 0.015), ('lexical', 0.014), ('case', 0.012), ('cagoem', 0.012), ('ectmobpeir', 0.012), ('fogru', 0.012), ('lnan', 0.012), ('oasfs', 0.012), ('ppruotcaetisosninagl', 0.012), ('thaceh', 0.012), ('ic', 0.011), ('tc', 0.011), ('ms', 0.01), ('feature', 0.01), ('single', 0.008), ('first', 0.001)]
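The wordName/wordTfidf pairs above are the paper's highest-weighted terms, and the simValue scores in the lists below are presumably cosine similarities between such weighted term vectors. A minimal sketch of how tf-idf weights and cosine similarity could be computed (the documents, tokens, and weights here are illustrative, not taken from the actual index):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute a sparse tf-idf vector (dict: term -> weight) per document."""
    n = len(docs)
    df = Counter()                       # document frequency of each term
    for doc in docs:
        df.update(set(doc))
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({w: (c / len(doc)) * math.log(n / df[w])
                     for w, c in tf.items()})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy corpus: doc 0 stands in for this paper, docs 1-2 for candidates.
docs = [
    "event anaphora resolution noun phrase event".split(),
    "tree to string translation decoding".split(),
    "event noun resolution verbal mention".split(),
]
vecs = tfidf_vectors(docs)
# Rank candidates by similarity to doc 0, most similar first.
sims = sorted(((cosine(vecs[0], v), i) for i, v in enumerate(vecs)),
              reverse=True)
```

With this ranking, the paper itself comes out on top (simValue ≈ 1.0), matching the "same-paper" rows in the lists below.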

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 93 emnlp-2010-Resolving Event Noun Phrases to Their Verbal Mentions

Author: Bin Chen ; Jian Su ; Chew Lim Tan

Abstract: unknown-abstract

2 0.093100101 42 emnlp-2010-Efficient Incremental Decoding for Tree-to-String Translation

Author: Liang Huang ; Haitao Mi

Abstract: Syntax-based translation models should in principle be efficient with polynomially-sized search space, but in practice they are often embarrassingly slow, partly due to the cost of language model integration. In this paper we borrow from phrase-based decoding the idea to generate a translation incrementally left-to-right, and show that for tree-to-string models, with a clever encoding of derivation history, this method runs in average-case polynomial-time in theory, and linear-time with beam search in practice (whereas phrase-based decoding is exponential-time in theory and quadratic-time in practice). Experiments show that, with comparable translation quality, our tree-to-string system (in Python) can run more than 30 times faster than the phrase-based system Moses (in C++).

3 0.052023705 75 emnlp-2010-Lessons Learned in Part-of-Speech Tagging of Conversational Speech

Author: Vladimir Eidelman ; Zhongqiang Huang ; Mary Harper

Abstract: This paper examines tagging models for spontaneous English speech transcripts. We analyze the performance of state-of-the-art tagging models, either generative or discriminative, left-to-right or bidirectional, with or without latent annotations, together with the use of ToBI break indexes and several methods for segmenting the speech transcripts (i.e., conversation side, speaker turn, or human-annotated sentence). Based on these studies, we observe that: (1) bidirectional models tend to achieve better accuracy levels than left-to-right models, (2) generative models seem to perform somewhat better than discriminative models on this task, and (3) prosody improves tagging performance of models on conversation sides, but has much less impact on smaller segments. We conclude that, although the use of break indexes can indeed significantly improve performance over baseline models without them on conversation sides, tagging accuracy improves more by using smaller segments, for which the impact of the break indexes is marginal.

4 0.046510097 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing

Author: Eugene Charniak

Abstract: We present a new syntactic parser that works left-to-right and top down, thus maintaining a fully-connected parse tree for a few alternative parse hypotheses. All of the commonly used statistical parsers use context-free dynamic programming algorithms and as such work bottom up on the entire sentence. Thus they only find a complete fully connected parse at the very end. In contrast, both subjective and experimental evidence show that people understand a sentence word-to-word as they go along, or close to it. The constraint that the parser keeps one or more fully connected syntactic trees is intended to operationalize this cognitive fact. Our parser achieves a new best result for top-down parsers of 89.4%, a 20% error reduction over the previous single-parser best result for parsers of this type of 86.8% (Roark, 2001). The improved performance is due to embracing the very large feature set available in exchange for giving up dynamic programming.

5 0.046429984 121 emnlp-2010-What a Parser Can Learn from a Semantic Role Labeler and Vice Versa

Author: Stephen Boxwell ; Dennis Mehay ; Chris Brew

Abstract: In many NLP systems, there is a unidirectional flow of information in which a parser supplies input to a semantic role labeler. In this paper, we build a system that allows information to flow in both directions. We make use of semantic role predictions in choosing a single-best parse. This process relies on an averaged perceptron model to distinguish likely semantic roles from erroneous ones. Our system penalizes parses that give rise to low-scoring semantic roles. To explore the consequences of this we perform two experiments. First, we use a baseline generative model to produce n-best parses, which are then re-ordered by our semantic model. Second, we use a modified version of our semantic role labeler to predict semantic roles at parse time. The performance of this modified labeler is weaker than that of our best full SRL, because it is restricted to features that can be computed directly from the parser’s packed chart. For both experiments, the resulting semantic predictions are then used to select parses. Finally, we feed the selected parses produced by each experiment to the full version of our semantic role labeler. We find that SRL performance can be improved over this baseline by selecting parses with likely semantic roles.

6 0.043526009 14 emnlp-2010-A Tree Kernel-Based Unified Framework for Chinese Zero Anaphora Resolution

7 0.041369345 20 emnlp-2010-Automatic Detection and Classification of Social Events

8 0.040805887 58 emnlp-2010-Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation

9 0.03988719 31 emnlp-2010-Constraints Based Taxonomic Relation Classification

10 0.034093928 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks

11 0.032991614 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

12 0.030924238 112 emnlp-2010-Unsupervised Discovery of Negative Categories in Lexicon Bootstrapping

13 0.025593653 4 emnlp-2010-A Game-Theoretic Approach to Generating Spatial Descriptions

14 0.025551418 124 emnlp-2010-Word Sense Induction Disambiguation Using Hierarchical Random Graphs

15 0.023171164 36 emnlp-2010-Discriminative Word Alignment with a Function Word Reordering Model

16 0.022891942 17 emnlp-2010-An Efficient Algorithm for Unsupervised Word Segmentation with Branching Entropy and MDL

17 0.022086563 33 emnlp-2010-Cross Language Text Classification by Model Translation and Semi-Supervised Learning

18 0.02037273 118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing

19 0.019957518 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions

20 0.019485462 8 emnlp-2010-A Multi-Pass Sieve for Coreference Resolution


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.068), (1, 0.025), (2, 0.046), (3, 0.072), (4, 0.034), (5, -0.002), (6, 0.018), (7, -0.077), (8, -0.056), (9, -0.04), (10, -0.004), (11, -0.109), (12, -0.031), (13, -0.008), (14, -0.006), (15, -0.039), (16, 0.069), (17, 0.058), (18, -0.038), (19, -0.12), (20, -0.043), (21, 0.125), (22, 0.05), (23, 0.071), (24, -0.154), (25, -0.035), (26, 0.011), (27, 0.069), (28, -0.047), (29, -0.074), (30, -0.198), (31, -0.132), (32, 0.001), (33, -0.045), (34, -0.411), (35, -0.167), (36, 0.09), (37, 0.233), (38, -0.167), (39, -0.119), (40, -0.157), (41, 0.076), (42, 0.132), (43, 0.035), (44, -0.193), (45, -0.205), (46, -0.345), (47, -0.133), (48, -0.1), (49, 0.1)]
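The topicId/topicWeight pairs above give the paper's coordinates in the LSI latent space; similarity is then measured between projected vectors rather than raw term counts. A small sketch of that projection step, with hypothetical topic loadings standing in for the truncated-SVD factors a real LSI model would provide:

```python
import math

# Hypothetical topic loadings (rows: latent topics, cols: vocabulary terms).
# In a real LSI model these come from a truncated SVD of the term-document
# matrix; the numbers below are made up for illustration.
topics = [
    [0.7, 0.1, 0.0, 0.2],   # topic 0: event/anaphora-flavored terms
    [0.0, 0.6, 0.5, 0.1],   # topic 1: parsing/translation-flavored terms
]

def project(doc_vec):
    """Project a document's term vector into the latent topic space."""
    return [sum(t * x for t, x in zip(topic, doc_vec)) for topic in topics]

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

paper = project([1.0, 0.0, 0.0, 0.5])   # illustrative term vectors
other = project([0.9, 0.1, 0.0, 0.4])
sim = cosine(paper, other)
```

Because both documents load mostly on topic 0, their similarity in the latent space is high even though their raw term vectors differ.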

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99637133 93 emnlp-2010-Resolving Event Noun Phrases to Their Verbal Mentions

Author: Bin Chen ; Jian Su ; Chew Lim Tan

Abstract: unknown-abstract

2 0.26903582 42 emnlp-2010-Efficient Incremental Decoding for Tree-to-String Translation

Author: Liang Huang ; Haitao Mi

Abstract: Syntax-based translation models should in principle be efficient with polynomially-sized search space, but in practice they are often embarrassingly slow, partly due to the cost of language model integration. In this paper we borrow from phrase-based decoding the idea to generate a translation incrementally left-to-right, and show that for tree-to-string models, with a clever encoding of derivation history, this method runs in average-case polynomial-time in theory, and linear-time with beam search in practice (whereas phrase-based decoding is exponential-time in theory and quadratic-time in practice). Experiments show that, with comparable translation quality, our tree-to-string system (in Python) can run more than 30 times faster than the phrase-based system Moses (in C++).

3 0.22008133 4 emnlp-2010-A Game-Theoretic Approach to Generating Spatial Descriptions

Author: Dave Golland ; Percy Liang ; Dan Klein

Abstract: Language is sensitive to both semantic and pragmatic effects. To capture both effects, we model language use as a cooperative game between two players: a speaker, who generates an utterance, and a listener, who responds with an action. Specifically, we consider the task of generating spatial references to objects, wherein the listener must accurately identify an object described by the speaker. We show that a speaker model that acts optimally with respect to an explicit, embedded listener model substantially outperforms one that is trained to directly generate spatial descriptions.

4 0.19555843 75 emnlp-2010-Lessons Learned in Part-of-Speech Tagging of Conversational Speech

Author: Vladimir Eidelman ; Zhongqiang Huang ; Mary Harper

Abstract: This paper examines tagging models for spontaneous English speech transcripts. We analyze the performance of state-of-the-art tagging models, either generative or discriminative, left-to-right or bidirectional, with or without latent annotations, together with the use of ToBI break indexes and several methods for segmenting the speech transcripts (i.e., conversation side, speaker turn, or human-annotated sentence). Based on these studies, we observe that: (1) bidirectional models tend to achieve better accuracy levels than left-to-right models, (2) generative models seem to perform somewhat better than discriminative models on this task, and (3) prosody improves tagging performance of models on conversation sides, but has much less impact on smaller segments. We conclude that, although the use of break indexes can indeed significantly improve performance over baseline models without them on conversation sides, tagging accuracy improves more by using smaller segments, for which the impact of the break indexes is marginal.

5 0.18337484 14 emnlp-2010-A Tree Kernel-Based Unified Framework for Chinese Zero Anaphora Resolution

Author: Fang Kong ; Guodong Zhou

Abstract: This paper proposes a unified framework for zero anaphora resolution, which can be divided into three sub-tasks: zero anaphor detection, anaphoricity determination and antecedent identification. In particular, all the three sub-tasks are addressed using tree kernel-based methods with appropriate syntactic parse tree structures. Experimental results on a Chinese zero anaphora corpus show that the proposed tree kernel-based methods significantly outperform the feature-based ones. This indicates the critical role of the structural information in zero anaphora resolution and the necessity of tree kernel-based methods in modeling such structural information. To our best knowledge, this is the first systematic work dealing with all the three sub-tasks in Chinese zero anaphora resolution via a unified framework. Moreover, we release a Chinese zero anaphora corpus of 100 documents, which adds a layer of annotation to the manually-parsed sentences in the Chinese Treebank (CTB) 6.0.

6 0.16166887 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks

7 0.14354365 94 emnlp-2010-SCFG Decoding Without Binarization

8 0.1252992 121 emnlp-2010-What a Parser Can Learn from a Semantic Role Labeler and Vice Versa

9 0.12472028 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

10 0.10830767 20 emnlp-2010-Automatic Detection and Classification of Social Events

11 0.1082845 31 emnlp-2010-Constraints Based Taxonomic Relation Classification

12 0.099301338 124 emnlp-2010-Word Sense Induction Disambiguation Using Hierarchical Random Graphs

13 0.099051625 58 emnlp-2010-Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation

14 0.095051803 110 emnlp-2010-Turbo Parsers: Dependency Parsing by Approximate Variational Inference

15 0.08875066 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing

16 0.086599648 81 emnlp-2010-Modeling Perspective Using Adaptor Grammars

17 0.0838525 90 emnlp-2010-Positional Language Models for Clinical Information Retrieval

18 0.077022366 9 emnlp-2010-A New Approach to Lexical Disambiguation of Arabic Text

19 0.074969634 112 emnlp-2010-Unsupervised Discovery of Negative Categories in Lexicon Bootstrapping

20 0.069708258 83 emnlp-2010-Multi-Level Structured Models for Document-Level Sentiment Classification


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(29, 0.032), (57, 0.781), (66, 0.038)]
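The sparse (topicId, topicWeight) list above is the paper's LDA topic distribution; document similarity under such a model is typically measured as a distance between distributions. A sketch using Hellinger distance, reusing the paper's reported weights together with a made-up comparison document:

```python
import math

def dense(sparse, n_topics):
    """Expand a sparse list of (topicId, weight) pairs into a dense vector."""
    v = [0.0] * n_topics
    for tid, w in sparse:
        v[tid] = w
    return v

def hellinger(p, q):
    """Hellinger distance between two topic distributions (0 = identical)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))

# The paper's LDA weights as reported above; the comparison document and
# the topic count (128) are hypothetical.
paper = dense([(29, 0.032), (57, 0.781), (66, 0.038)], 128)
other = dense([(57, 0.700), (66, 0.100)], 128)
d = hellinger(paper, other)
```

A smaller distance means a more similar topic mixture; here both documents concentrate their mass on topic 57, so d is small.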

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.91816187 93 emnlp-2010-Resolving Event Noun Phrases to Their Verbal Mentions

Author: Bin Chen ; Jian Su ; Chew Lim Tan

Abstract: unknown-abstract

2 0.059602425 34 emnlp-2010-Crouching Dirichlet, Hidden Markov Model: Unsupervised POS Tagging with Context Local Tag Generation

Author: Taesun Moon ; Katrin Erk ; Jason Baldridge

Abstract: We define the crouching Dirichlet, hidden Markov model (CDHMM), an HMM for part-of-speech tagging which draws state prior distributions for each local document context. This simple modification of the HMM takes advantage of the dichotomy in natural language between content and function words. In contrast, a standard HMM draws all prior distributions once over all states and it is known to perform poorly in unsupervised and semi-supervised POS tagging. This modification significantly improves unsupervised POS tagging performance across several measures on five data sets for four languages. We also show that simply using different hyperparameter values for content and function word states in a standard HMM (which we call HMM+) is surprisingly effective.

3 0.059328862 29 emnlp-2010-Combining Unsupervised and Supervised Alignments for MT: An Empirical Study

Author: Jinxi Xu ; Antti-Veikko Rosti

Abstract: Word alignment plays a central role in statistical MT (SMT) since almost all SMT systems extract translation rules from word aligned parallel training data. While most SMT systems use unsupervised algorithms (e.g. GIZA++) for training word alignment, supervised methods, which exploit a small amount of human-aligned data, have become increasingly popular recently. This work empirically studies the performance of these two classes of alignment algorithms and explores strategies to combine them to improve overall system performance. We used two unsupervised aligners, GIZA++ and HMM, and one supervised aligner, ITG, in this study. To avoid language and genre specific conclusions, we ran experiments on test sets consisting of two language pairs (Chinese-to-English and Arabic-to-English) and two genres (newswire and weblog). Results show that the two classes of algorithms achieve the same level of MT performance. Modest improvements were achieved by taking the union of the translation grammars extracted from different alignments. Significant improvements (around 1.0 in BLEU) were achieved by combining outputs of different systems trained with different alignments. The improvements are consistent across languages and genres.

4 0.059034798 114 emnlp-2010-Unsupervised Parse Selection for HPSG

Author: Rebecca Dridan ; Timothy Baldwin

Abstract: Parser disambiguation with precision grammars generally takes place via statistical ranking of the parse yield of the grammar using a supervised parse selection model. In the standard process, the parse selection model is trained over a hand-disambiguated treebank, meaning that without a significant investment of effort to produce the treebank, parse selection is not possible. Furthermore, as treebanking is generally streamlined with parse selection models, creating the initial treebank without a model requires more resources than subsequent treebanks. In this work, we show that, by taking advantage of the constrained nature of these HPSG grammars, we can learn a discriminative parse selection model from raw text in a purely unsupervised fashion. This allows us to bootstrap the treebanking process and provide better parsers faster, and with less resources.

5 0.059012301 99 emnlp-2010-Statistical Machine Translation with a Factorized Grammar

Author: Libin Shen ; Bing Zhang ; Spyros Matsoukas ; Jinxi Xu ; Ralph Weischedel

Abstract: In modern machine translation practice, a statistical phrasal or hierarchical translation system usually relies on a huge set of translation rules extracted from bi-lingual training data. This approach not only results in space and efficiency issues, but also suffers from the sparse data problem. In this paper, we propose to use factorized grammars, an idea widely accepted in the field of linguistic grammar construction, to generalize translation rules, so as to solve these two problems. We designed a method to take advantage of the XTAG English Grammar to facilitate the extraction of factorized rules. We experimented on various setups of low-resource language translation, and showed consistent significant improvement in BLEU over state-of-the-art string-to-dependency baseline systems with 200K words of bi-lingual training data.

6 0.058978356 7 emnlp-2010-A Mixture Model with Sharing for Lexical Semantics

7 0.058831621 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?

8 0.058438417 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing

9 0.058199979 97 emnlp-2010-Simple Type-Level Unsupervised POS Tagging

10 0.058155097 109 emnlp-2010-Translingual Document Representations from Discriminative Projections

11 0.058089606 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment

12 0.057985388 43 emnlp-2010-Enhancing Domain Portability of Chinese Segmentation Model Using Chi-Square Statistics and Bootstrapping

13 0.057825275 104 emnlp-2010-The Necessity of Combining Adaptation Methods

14 0.057573315 77 emnlp-2010-Measuring Distributional Similarity in Context

15 0.057333641 96 emnlp-2010-Self-Training with Products of Latent Variable Grammars

16 0.057100132 22 emnlp-2010-Automatic Evaluation of Translation Quality for Distant Language Pairs

17 0.057030499 124 emnlp-2010-Word Sense Induction Disambiguation Using Hierarchical Random Graphs

18 0.056894828 63 emnlp-2010-Improving Translation via Targeted Paraphrasing

19 0.056772165 84 emnlp-2010-NLP on Spoken Documents Without ASR

20 0.056724366 2 emnlp-2010-A Fast Decoder for Joint Word Segmentation and POS-Tagging Using a Single Discriminative Model