emnlp emnlp2012 emnlp2012-45 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Junsheng Zhou ; Weiguang Qu ; Fen Zhang
Abstract: Most existing systems solve the phrase chunking task with sequence labeling approaches, in which chunk candidates cannot be treated as a whole during the parsing process, so chunk-level features cannot be exploited in a natural way. In this paper, we formulate phrase chunking as a joint segmentation and labeling task. We propose an efficient dynamic programming algorithm with pruning for decoding, which allows the direct use of features describing the internal characteristics of a chunk and features capturing the correlations between adjacent chunks. A relaxed, online maximum margin training algorithm is used for learning. Within this framework, we explore a variety of effective feature representations for Chinese phrase chunking. The experimental results show that the use of chunk-level features leads to significant performance improvement, and that our approach achieves state-of-the-art performance. In particular, our approach is much better at recognizing long and complicated phrases.
Reference: text
sentIndex sentText sentNum sentScore
1 Most existing systems solve the phrase chunking task with sequence labeling approaches, in which chunk candidates cannot be treated as a whole during the parsing process, so chunk-level features cannot be exploited in a natural way. [sent-5, score-1.541]
2 In this paper, we formulate phrase chunking as a joint segmentation and labeling task. [sent-6, score-0.964]
3 We propose an efficient dynamic programming algorithm with pruning for decoding, which allows the direct use of the features describing the internal characteristics of chunk and the features capturing the correlations between adjacent chunks. [sent-7, score-0.85]
4 1 Introduction Phrase chunking is a Natural Language Processing task that consists of dividing a text into syntactically correlated parts of words. [sent-12, score-0.625]
5 , a word can only be a member of one chunk (Abney, 1991). [sent-15, score-0.631]
6 Generally speaking, there are two phrase chunking tasks: text chunking (shallow parsing) and noun phrase (NP) chunking. [sent-16, score-1.36]
7 Phrase chunking provides a key feature that helps with more elaborate NLP tasks such as parsing, semantic role tagging and information extraction. [sent-17, score-0.625]
8 There is a wide range of research work on phrase chunking based on machine learning approaches. [sent-18, score-0.68]
9 However, most previous work reduced phrase chunking to a sequence labeling problem, using either classification models such as SVMs (Kudo and Matsumoto, 2001) or Winnow and voted perceptrons (Zhang et al. [sent-19, score-0.868]
10 Firstly, these models cannot treat a sequence of contiguous words globally as a chunk candidate, and thus cannot inspect the internal structure of the candidate, which is an important source of information for modeling phrase chunking. [sent-22, score-0.76]
11 In particular, it makes it impossible to use local indicator-function features of the type "the chunk consists of POS tag sequence p1. [sent-23, score-0.751]
12 For instance, the chunk candidate "『 生命(Life) 禁区(Forbidden Zone) ” is considered to be an invalid chunk. [sent-34, score-0.652]
13 But it is easy to check this kind of punctuation matching in a single chunk by introducing a chunk-level feature. [sent-35, score-0.631]
14 Secondly, the sequence labeling models cannot capture the correlations between adjacent chunks, which should be informative for the identification of chunk boundaries and types. [sent-36, score-0.933]
15 In particular, we find that some headwords in the sentence are expected to have a stronger dependency relation with the headwords of preceding chunks than with their immediately preceding words within the same chunk. [sent-37, score-0.286]
16 In summary, the inherent deficiency in applying the sequence labeling approaches to phrase chunking is that the chunk-level features one would expect to be very informative cannot be exploited in a natural way. [sent-39, score-0.89]
17 The experimental results on a Chinese chunking corpus as well as an English chunking corpus show that the use of chunk-level features can lead to significant performance improvement, and that our approach performs better than other approaches based on sequence labeling models. [sent-42, score-1.46]
18 2 Related Work In recent years, many chunking systems based on machine learning approaches have been presented. [sent-43, score-0.625]
19 To accommodate multiple overlapping features on observations, some other approaches view the phrase chunking as a sequence of classification problems, including support vector machines (SVMs) (Kudo and Matsumoto 2001) and a variety of other classifiers (Zhang et al. [sent-47, score-0.744]
20 Some similar approaches based on classifiers or sequence labeling models were also used for Chinese chunking (Li et al. [sent-54, score-0.813]
21 (2006) conducted an empirical study of Chinese chunking on a corpus, which was extracted from UPENN Chinese Treebank-4 (CTB4). [sent-59, score-0.625]
22 In this paper, we model phrase chunking with a joint segmentation and labeling approach, which offers advantages over previous learning methods by explicitly incorporating the internal structural features and the correlations between adjacent chunks. [sent-61, score-1.09]
23 2006) so that a more direct comparison with state-of-the-art systems for Chinese chunking would be possible. [sent-68, score-0.625]
24 There are 12 types of chunks: ADJP, ADVP, CLP, DNP, DP, DVP, LCP, LST, NP, PP, QP and VP in the chunking corpus (Xue et al. [sent-69, score-0.625]
25 2 Sequence Labeling Approaches to Phrase Chunking The standard approach to phrase chunking is to use tagging techniques with a BIO tag set. [sent-74, score-0.704]
26 With the data representation like the S2, the problem of phrase chunking can be reduced to a sequence labeling task. [sent-77, score-0.868]
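The BIO reduction described here can be illustrated with a minimal sketch (the example sentence and the helper name `chunks_to_bio` are ours, not from the paper):

```python
def chunks_to_bio(chunks):
    """Convert a list of (chunk_type, words) pairs into per-word BIO tags."""
    tags = []
    for ctype, words in chunks:
        for i, _ in enumerate(words):
            tags.append(("B-" if i == 0 else "I-") + ctype)
    return tags

# Hypothetical example: an NP chunk followed by a VP chunk.
print(chunks_to_bio([("NP", ["北京", "机场"]), ("VP", ["开通"])]))
# ['B-NP', 'I-NP', 'B-VP']
```

Under this encoding a chunker only predicts one tag per word, which is what prevents it from scoring a multi-word chunk candidate as a whole.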
27 The joint model considers all possible chunk boundaries and corresponding chunk types in the sentence, and chooses the overall best output. [sent-80, score-1.287]
28 After one chunk is found, the parser moves on and searches for the next possible chunk. [sent-82, score-0.631]
29 Given a sentence x, let y denote an output tagged with chunk types, and GEN a function that enumerates a set of segmentation and labeling candidates GEN(x) for x. [sent-83, score-0.903]
30 The main advantage of the joint segmentation and labeling approach to phrase chunking is to allow for integrating both the internal structural features and the correlations between the adjacent chunks for prediction. [sent-91, score-1.253]
31 The search space of combined candidates in the joint segmentation and labeling task is very large: the number of possible candidates grows exponentially with sentence length. [sent-97, score-0.285]
32 The rate of growth is O(2^n T^n) for the joint system, where n is the length of the sentence and T is the number of chunk types. [sent-98, score-0.702]
33 In other words, we assume that the chunk c_i and the corresponding label t_i are only associated with the preceding chunk c_{i-1} and the label t_{i-1}. [sent-104, score-1.329]
34 Suppose that the input sentence has n words and the constant M is the maximum chunk length in the training corpus. [sent-105, score-0.656]
35 Let V(b,e,t) denote the highest-scored segmentation and labeling with the last chunk starting at word index b, ending at word index e and the last chunk type being t. [sent-106, score-1.591]
36 n-1, and all possible chunk type t, respectively, and then pick the highest-scored one from these candidates. [sent-109, score-0.663]
37 In order to compute V(b,n-1,t), the last chunk needs to be combined with all possible different segmentations of words (b−M)… [sent-110, score-0.649]
38 …b−1 and all possible different chunk types so that the highest-scored one can be selected. [sent-112, score-0.631]
…b−1 and all possible chunk types with the last chunk being word b′. [sent-115, score-1.262]
…b−1 and the last chunk type being t′ will also give the highest score when combined with the word b. [sent-117, score-0.663]
…M−1, and each possible chunk type t, are solved in a straightforward manner. [sent-122, score-0.663]
42 And the final highest-scored segmentation and labeling can be 560 found by solving all subproblems in a bottom-up fashion. [sent-123, score-0.27]
43 It works by filling an n by n by T table chart, where n is the number of words in the input sentence sent, and T is the number of chunk types. [sent-125, score-0.631]
chunk type index t for the current chunk; chunk type index t′ for the previous chunk; Initialization: for e = 0. [sent-138, score-1.384]
45 T: chart[0,e,t] ←single chunk sent[0,e] and type t Algorithm: for e = 0. [sent-142, score-0.663]
T: chart[b,e,t] ← the highest-scored segmentation and labeling among those derived by combining chart[p, b−1, t′] with sent[b,e] and chunk type t, for p = (b−M). [sent-148, score-0.902]
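The bottom-up chart-filling procedure above can be sketched as follows. Here `score(b, e, t, prev_t)` is an assumed stand-in for the model's feature-based score of a chunk spanning words b..e with type t following a chunk of type prev_t; it is not the paper's actual feature function, and the pruning filters of the next subsection are omitted:

```python
def decode(words, types, M, score):
    """Viterbi-style DP over chunk candidates (a sketch).

    chart[(b, e, t)] holds the best score of any segmentation whose last
    chunk spans words[b..e] with type t; back[...] holds backpointers.
    """
    n = len(words)
    chart, back = {}, {}
    for e in range(n):
        # chunk length is bounded by M, so the start index b >= e - M + 1
        for b in range(max(0, e - M + 1), e + 1):
            for t in types:
                if b == 0:  # first chunk of the sentence
                    chart[(b, e, t)] = score(b, e, t, None)
                    back[(b, e, t)] = None
                    continue
                best, arg = float("-inf"), None
                for p in range(max(0, b - M), b):  # start of previous chunk
                    for pt in types:               # type of previous chunk
                        if (p, b - 1, pt) not in chart:
                            continue
                        s = chart[(p, b - 1, pt)] + score(b, e, t, pt)
                        if s > best:
                            best, arg = s, (p, b - 1, pt)
                if arg is not None:
                    chart[(b, e, t)], back[(b, e, t)] = best, arg
    # best complete analysis: highest-scored last chunk ending at word n-1
    end = max((k for k in chart if k[1] == n - 1), key=chart.get)
    out = []
    while end is not None:
        out.append(end)
        end = back[end]
    return list(reversed(out))
```

The three chunk loops over (e, b, t) and the two predecessor loops over (p, t′) give the O(M²T²n) time complexity stated in the pruning subsection.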
47 2 Pruning The time complexity of the above algorithm is O(M2T2n), where M is the maximum chunk size. [sent-160, score-0.631]
48 Firstly, we collect chunk type transition information between chunk types by observing every pair of adjacent chunks in the training corpus, and record a chunk type transition matrix. [sent-165, score-2.235]
49 For example, in the Chinese Treebank that we used for our experiments, a transition from chunk type ADJP to ADVP does not occur in the training corpus, so the corresponding matrix element is set to false (and to true otherwise). [sent-166, score-0.683]
50 During decoding, the chunk type transition information is used to prune unlikely combinations between current chunk and the preceding chunk by their chunk types. [sent-167, score-2.599]
51 Secondly, a POS tag dictionary is used to record POS tags associated with each chunk type. [sent-168, score-0.7]
52 Specifically, for each chunk type, we record all POS tags appearing in this type of chunk in the training corpus. [sent-169, score-1.339]
53 During decoding, a segment of contiguous words that contains only allowed POS tags according to the POS tag dictionary is considered a valid chunk candidate. [sent-170, score-0.707]
54 Finally, the system records the maximum number of words for each type of chunk in the training corpus. [sent-171, score-0.663]
55 The few chunk types that are seen with length bigger than ten are NP, QP and ADJP. [sent-173, score-0.656]
56 During decoding, the chunk candidate whose length is greater than the maximum chunk length associated with its chunk type will be discarded. [sent-174, score-1.975]
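The three pruning resources described above (type-transition matrix, per-type POS dictionary, and per-type maximum length) could be collected and applied roughly as follows; the function names and data layout are ours, not the paper's:

```python
from collections import defaultdict

def build_pruning_tables(training_chunks):
    """Collect pruning resources from a training corpus given as a list of
    sentences, each a list of (chunk_type, [POS tags]) pairs."""
    transitions = set()          # allowed (prev_type, cur_type) pairs
    pos_dict = defaultdict(set)  # chunk type -> POS tags seen inside it
    max_len = defaultdict(int)   # chunk type -> max observed length
    for sent in training_chunks:
        prev = None
        for ctype, pos_tags in sent:
            if prev is not None:
                transitions.add((prev, ctype))
            pos_dict[ctype].update(pos_tags)
            max_len[ctype] = max(max_len[ctype], len(pos_tags))
            prev = ctype
    return transitions, pos_dict, max_len

def is_valid_candidate(ctype, pos_tags, prev_type,
                       transitions, pos_dict, max_len):
    """Apply the three pruning filters to a chunk candidate."""
    if prev_type is not None and (prev_type, ctype) not in transitions:
        return False                        # unseen type transition
    if not set(pos_tags) <= pos_dict[ctype]:
        return False                        # POS tag never seen in this type
    return len(pos_tags) <= max_len[ctype]  # length bound
```

During decoding, candidates failing any filter are discarded before scoring, which is what reduces the effective search space.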
57 2 Loss Function For the joint segmentation and labeling task, there are two alternative loss functions: 0-1 loss and F1 loss. [sent-200, score-0.4]
58 The most common loss function for joint segmentation and labeling problems is F1 measure over chunks. [sent-202, score-0.332]
59 This is the harmonic mean of precision and recall over the (properly-labeled) chunk identification task, defined as follows. [sent-203, score-0.631]
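Treating chunks as (start, end, type) triples, the chunk-level F1 can be computed as below (a sketch; how the paper plugs this loss into its MIRA-style training is not shown):

```python
def chunk_f1(gold, pred):
    """Chunk-level F1: harmonic mean of precision and recall over
    (start, end, type) triples."""
    gold, pred = set(gold), set(pred)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    p = correct / len(pred)  # precision: fraction of output chunks correct
    r = correct / len(gold)  # recall: fraction of reference chunks found
    return 2 * p * r / (p + r)
```

A chunk counts as correct only when boundaries and type all match, so F1 loss penalizes boundary and labeling errors jointly, unlike per-word 0-1 loss.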
60 3 Features Table 1 shows the feature templates for the joint segmentation and labeling model. [sent-209, score-0.31]
61 In the row for feature templates, c, t, w and p are used to represent a chunk, a chunk type, a word and a POS tag, respectively. [sent-210, score-0.631]
62 And c0 and c−1 represent the current chunk and the previous chunk respectively. [sent-211, score-1.262]
63 In Table 1, templates 1-4 are SL-type features, where label(w) denotes the label indicating the position of the word w in the current chunk; len(c) denotes the length of chunk c. [sent-216, score-0.778]
64 For example, given an NP chunk "北京(Beijing) 机场 (Airport)", which includes two words, the value of label("北京") is "B" and the value of label("机场") is "I". [sent-217, score-0.631]
65 Template specitermMatch(c) is used to check the punctuation matching within chunk c for the special terms, as illustrated in section 1. [sent-220, score-0.631]
66 Secondly, in our model, we have a chance to treat the chunk candidate as a whole during decoding, which means that we can employ more expressive features in our model than in the sequence labeling models. [sent-221, score-0.841]
67 In Table 1, templates 5-13 concern the Internal-type features, where start_word(c) and end_word(c) represent the first word and the last word of chunk c, respectively. [sent-222, score-0.677]
68 Similarly, start_POS(c) and end_POS(c) represent the POS tags associated with the first word and the last word of chunk c, respectively. [sent-223, score-0.657]
69 These features aim at expressing the formation patterns of the current chunk with respect to words and POS tags. [sent-224, score-0.653]
70 Template internalWords(c) denotes the concatenation of words in chunk c, while internalPOSs(c) denotes the sequence of POS tags in chunk c using regular expression-like form, as illustrated in section 1. [sent-225, score-1.384]
71 Finally, in Table 1, templates 14-28 concern the Correlation-type features, where head(c) denotes the headword extracted from chunk c, and headPOS(c) denotes the POS tag associated with the headword in chunk c. [sent-226, score-1.546]
72 For example, we extracted the headwords located in adjacent chunks to form headword bigrams to express semantic dependency between adjacent chunks. [sent-228, score-0.402]
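A few of the template groups from Table 1 can be sketched as string-valued features. This is our illustration, not the paper's full inventory: `head_index` is an assumed head-finding helper, and the feature-name strings are ours.

```python
def chunk_features(words, pos, cur, prev, head_index):
    """Extract illustrative features for a chunk candidate.

    `cur`/`prev` are (start, end, type) triples; `head_index` maps a
    chunk triple to the position of its headword (an assumed helper).
    """
    b, e, t = cur
    feats = [
        # Internal-type: formation patterns of the current chunk
        f"start_word={words[b]}|{t}",
        f"end_word={words[e]}|{t}",
        f"internal_pos={'_'.join(pos[b:e + 1])}|{t}",
        f"len={e - b + 1}|{t}",
    ]
    if prev is not None:
        h, ph = head_index(cur), head_index(prev)
        feats += [
            # Correlation-type: interactions between adjacent chunks
            f"type_bigram={prev[2]}|{t}",
            f"head_bigram={words[ph]}|{words[h]}",
        ]
    return feats
```

Features like `head_bigram` are exactly the kind that a per-word BIO tagger cannot express, because they require both adjacent chunks to be fixed before scoring.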
73 1 Experiments Data Sets and Evaluation Following previous studies on Chinese chunking in (Chen et al. [sent-233, score-0.625]
74 The standard evaluation metrics for this task are precision p (the fraction of output chunks matching the reference chunks), recall r (the fraction of reference chunks returned), and the F-measure given by F = 2pr/(p + r). [sent-242, score-0.282]
75 2 Chinese NP chunking NP is the most important phrase type in Chinese chunking, and about 47% of the phrases in the CTB4 Corpus are NPs. [sent-247, score-1.305]
76 3 Chinese Text Chunking There are 12 different types of phrases in the chunking corpus. [sent-258, score-0.625]
77 In this section, we give a comparison and analysis between our model and other state-of-theart machine learning models for Chinese NP chunking and text chunking tasks. [sent-267, score-1.25]
78 Observing the results in Table 4, we can see that for both NP chunking and text chunking tasks, our model achieves significant performance improvement over those state-of-the-art systems in terms of the F1-score, even for the voting methods. [sent-271, score-1.29]
79 For the text chunking task, our approach improves performance by 0. [sent-272, score-0.625]
80 In particular, for NP chunking task, the F1-score of our approach is improved by 0. [sent-276, score-0.625]
81 Figure 3 shows the comparison of F1-scores of the two systems by the chunk length. [sent-280, score-0.631]
82 In the Chinese chunking corpus, the max NP length is 27, and the mean NP length is 1. [sent-281, score-0.675]
83 From the figure, we can see that the performance gap grows rapidly with the increase of the chunk length. [sent-287, score-0.631]
84 But the gap begins to become smaller with further growth of the chunk length. [sent-290, score-0.652]
85 Figure 3: Comparison of F1-scores of NP recognition on the Chinese corpus by chunk length (our system vs. SVM; NP lengths 1-8 and >8). [sent-299, score-0.656]
86 5 Impact of Different Types of Features Our phrase chunking model is highly dependent upon chunk-level information. [sent-301, score-0.68]
87 Then, adding the Internal-type features to the system results in significant performance improvement on NP chunking and on text chunking, achieving 2. [sent-307, score-0.647]
88 Further, if Correlation-type features are used, the F1-scores on NP chunking and on text chunking are improved by 1. [sent-310, score-1.272]
89 The results show a significant impact due to the use of Internal-type features and Correlation-type features for both NP chunking and text chunking. [sent-313, score-0.669]
90 6 Performance on Other Languages We mainly focused on Chinese chunking in this paper. [sent-316, score-0.625]
91 To validate this point, we evaluated our system on the CoNLL 2000 data set, a public benchmarking corpus for English chunking (Sang and Buchholz 2000). [sent-318, score-0.652]
92 We conducted both the NP-chunking and text chunking experiments on this data set with our approach, using the same feature templates as in the Chinese chunking task, excluding template 13. [sent-320, score-1.296]
93 Table 6 also shows state-of-the-art performance for both NP-chunking and text chunking tasks. [sent-323, score-0.625]
94 , 2008) are the stateof-the-art for the NP chunking task, and SVM's results presented in (Wu et al. [sent-325, score-0.625]
95 , 2006) are the state- of-the-art for the text chunking task. [sent-326, score-0.625]
96 Moreover, the performance should be further improved if some additional features tailored for English chunking are employed in our model. [sent-327, score-0.647]
97 7 Conclusions and Future Work In this paper we have presented a novel approach to phrase chunking by formulating it as a joint segmentation and labeling problem. [sent-333, score-0.944]
98 The experimental results on both Chinese chunking and English chunking tasks show that the use of chunk-level features can lead to significant performance improvement and that our approach outperforms the best in the literature. [sent-335, score-1.272]
99 Chinese chunk identification using svms single discriminative model. [sent-433, score-0.69]
100 A general and multi-lingual phrase chunking model based on masking method. [sent-448, score-0.68]
simIndex simValue paperId paperTitle
2 0.12383451 70 emnlp-2012-Joint Chinese Word Segmentation, POS Tagging and Parsing
Author: Xian Qian ; Yang Liu
Abstract: In this paper, we propose a novel decoding algorithm for discriminative joint Chinese word segmentation, part-of-speech (POS) tagging, and parsing. Previous work often used a pipeline method Chinese word segmentation followed by POS tagging and parsing, which suffers from error propagation and is unable to leverage information in later modules for earlier components. In our approach, we train the three individual models separately during training, and incorporate them together in a unified framework during decoding. We extend the CYK parsing algorithm so that it can deal with word segmentation and POS tagging features. As far as we know, this is the first work on joint Chinese word segmentation, POS tagging and parsing. Our experimental results on Chinese Tree Bank 5 corpus show that our approach outperforms the state-of-the-art pipeline system. –
3 0.1177851 67 emnlp-2012-Inducing a Discriminative Parser to Optimize Machine Translation Reordering
Author: Graham Neubig ; Taro Watanabe ; Shinsuke Mori
Abstract: This paper proposes a method for learning a discriminative parser for machine translation reordering using only aligned parallel text. This is done by treating the parser’s derivation tree as a latent variable in a model that is trained to maximize reordering accuracy. We demonstrate that efficient large-margin training is possible by showing that two measures of reordering accuracy can be factored over the parse tree. Using this model in the pre-ordering framework results in significant gains in translation accuracy over standard phrasebased SMT and previously proposed unsupervised syntax induction methods.
4 0.099400915 106 emnlp-2012-Part-of-Speech Tagging for Chinese-English Mixed Texts with Dynamic Features
Author: Jiayi Zhao ; Xipeng Qiu ; Shu Zhang ; Feng Ji ; Xuanjing Huang
Abstract: In modern Chinese articles or conversations, it is very popular to involve a few English words, especially in emails and Internet literature. Therefore, it becomes an important and challenging topic to analyze Chinese-English mixed texts. The underlying problem is how to tag part-of-speech (POS) for the English words involved. Due to the lack of specially annotated corpus, most of the English words are tagged as the oversimplified type, “foreign words”. In this paper, we present a method using dynamic features to tag POS of mixed texts. Experiments show that our method achieves higher performance than traditional sequence labeling methods. Meanwhile, our method also boosts the performance of POS tagging for pure Chinese texts.
5 0.088936746 131 emnlp-2012-Unified Dependency Parsing of Chinese Morphological and Syntactic Structures
Author: Zhongguo Li ; Guodong Zhou
Abstract: Most previous approaches to syntactic parsing of Chinese rely on a preprocessing step of word segmentation, thereby assuming there was a clearly defined boundary between morphology and syntax in Chinese. We show how this assumption can fail badly, leading to many out-of-vocabulary words and incompatible annotations. Hence in practice the strict separation of morphology and syntax in the Chinese language proves to be untenable. We present a unified dependency parsing approach for Chinese which takes unsegmented sentences as input and outputs both morphological and syntactic structures with a single model and algorithm. By removing the intermediate word segmentation, the unified parser no longer needs separate notions for words and phrases. Evaluation proves the effectiveness of the unified model and algorithm in parsing structures of words, phrases and sen- tences simultaneously. 1
6 0.084988184 7 emnlp-2012-A Novel Discriminative Framework for Sentence-Level Discourse Analysis
7 0.069335058 12 emnlp-2012-A Transition-Based System for Joint Part-of-Speech Tagging and Labeled Non-Projective Dependency Parsing
8 0.065716609 105 emnlp-2012-Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output
9 0.063809812 57 emnlp-2012-Generalized Higher-Order Dependency Parsing with Cube Pruning
10 0.060954113 127 emnlp-2012-Transforming Trees to Improve Syntactic Convergence
11 0.054893959 11 emnlp-2012-A Systematic Comparison of Phrase Table Pruning Techniques
12 0.052375831 51 emnlp-2012-Extracting Opinion Expressions with semi-Markov Conditional Random Fields
13 0.049981434 68 emnlp-2012-Iterative Annotation Transformation with Predict-Self Reestimation for Chinese Word Segmentation
14 0.045631409 64 emnlp-2012-Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints
15 0.044916324 42 emnlp-2012-Entropy-based Pruning for Phrase-based Machine Translation
16 0.043959375 21 emnlp-2012-Assessment of ESL Learners' Syntactic Competence Based on Similarity Measures
17 0.042949978 82 emnlp-2012-Left-to-Right Tree-to-String Decoding with Prediction
18 0.041483492 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
19 0.040985994 129 emnlp-2012-Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries
20 0.038863167 24 emnlp-2012-Biased Representation Learning for Domain Adaptation
same-paper 1 0.77347308 45 emnlp-2012-Exploiting Chunk-level Features to Improve Phrase Chunking
Author: Junsheng Zhou ; Weiguang Qu ; Fen Zhang
Abstract: Most existing systems solve the phrase chunking task with sequence labeling approaches, in which chunk candidates cannot be treated as a whole during the parsing process, so chunk-level features cannot be exploited in a natural way. In this paper, we formulate phrase chunking as a joint segmentation and labeling task. We propose an efficient dynamic programming algorithm with pruning for decoding, which allows the direct use of features describing the internal characteristics of a chunk and features capturing the correlations between adjacent chunks. A relaxed, online maximum margin training algorithm is used for learning. Within this framework, we explore a variety of effective feature representations for Chinese phrase chunking. The experimental results show that the use of chunk-level features can lead to significant performance improvement, and that our approach achieves state-of-the-art performance. In particular, our approach is much better at recognizing long and complicated phrases.
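The joint segmentation-and-labeling decoding described in this abstract can be illustrated with a semi-Markov-style dynamic program over chunk candidates, where the scorer sees each candidate chunk as a whole. The sketch below is illustrative only, not the paper's implementation: the `score` function, label set, and maximum chunk length are placeholder assumptions.

```python
# A minimal sketch of dynamic-programming decoding for joint
# segmentation and labeling. Because `score` receives the whole
# candidate span words[start:end], it can use chunk-level features.

def decode(words, labels, score, max_len=4):
    """Return the best chunking of `words` as (start, end, label) triples.

    score(words, start, end, label, prev_label) -> float is an assumed
    chunk-level scoring function (a stand-in for the learned model).
    """
    n = len(words)
    # best[i][y]: best score of an analysis of words[:i] whose last
    # chunk has label y; back[i][y] stores the matching backpointer.
    best = [{} for _ in range(n + 1)]
    back = [{} for _ in range(n + 1)]
    best[0][None] = 0.0
    for i in range(1, n + 1):
        for k in range(1, min(max_len, i) + 1):  # candidate chunk length
            j = i - k
            for prev, s in best[j].items():
                for y in labels:
                    cand = s + score(words, j, i, y, prev)
                    if cand > best[i].get(y, float("-inf")):
                        best[i][y] = cand
                        back[i][y] = (j, prev)
    # Follow backpointers to recover the best chunk sequence.
    y = max(best[n], key=best[n].get)
    i, chunks = n, []
    while i > 0:
        j, prev = back[i][y]
        chunks.append((j, i, y))
        i, y = j, prev
    return list(reversed(chunks))
```

With a toy scorer that rewards two-word NP chunks, the decoder groups a four-word input into two NPs rather than four singleton chunks; the pruning the paper describes would additionally discard low-scoring candidates at each position.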
2 0.58628821 123 emnlp-2012-Syntactic Transfer Using a Bilingual Lexicon
Author: Greg Durrett ; Adam Pauls ; Dan Klein
Abstract: We consider the problem of using a bilingual dictionary to transfer lexico-syntactic information from a resource-rich source language to a resource-poor target language. In contrast to past work that used bitexts to transfer analyses of specific sentences at the token level, we instead use features to transfer the behavior of words at a type level. In a discriminative dependency parsing framework, our approach produces gains across a range of target languages, using two different lowresource training methodologies (one weakly supervised and one indirectly supervised) and two different dictionary sources (one manually constructed and one automatically constructed).
Author: Bernd Bohnet ; Joakim Nivre
Abstract: Most current dependency parsers presuppose that input words have been morphologically disambiguated using a part-of-speech tagger before parsing begins. We present a transition-based system for joint part-of-speech tagging and labeled dependency parsing with non-projective trees. Experimental evaluation on Chinese, Czech, English and German shows consistent improvements in both tagging and parsing accuracy when compared to a pipeline system, which lead to improved state-of-the-art results for all languages.
4 0.57863647 70 emnlp-2012-Joint Chinese Word Segmentation, POS Tagging and Parsing
Author: Xian Qian ; Yang Liu
Abstract: In this paper, we propose a novel decoding algorithm for discriminative joint Chinese word segmentation, part-of-speech (POS) tagging, and parsing. Previous work often used a pipeline method: Chinese word segmentation followed by POS tagging and parsing, which suffers from error propagation and is unable to leverage information in later modules for earlier components. In our approach, we train the three individual models separately, and incorporate them together in a unified framework during decoding. We extend the CYK parsing algorithm so that it can deal with word segmentation and POS tagging features. As far as we know, this is the first work on joint Chinese word segmentation, POS tagging and parsing. Our experimental results on the Chinese Treebank 5 corpus show that our approach outperforms the state-of-the-art pipeline system.
5 0.57692522 64 emnlp-2012-Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints
Author: Alexander Rush ; Roi Reichart ; Michael Collins ; Amir Globerson
Abstract: State-of-the-art statistical parsers and POS taggers perform very well when trained with large amounts of in-domain data. When training data is out-of-domain or limited, accuracy degrades. In this paper, we aim to compensate for the lack of available training data by exploiting similarities between test set sentences. We show how to augment sentence-level models for parsing and POS tagging with inter-sentence consistency constraints. To deal with the resulting global objective, we present an efficient and exact dual decomposition decoding algorithm. In experiments, we add consistency constraints to the MST parser and the Stanford part-of-speech tagger and demonstrate significant error reduction in the domain adaptation and the lightly supervised settings across five languages.
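Dual decomposition decoding, as mentioned in this abstract, can be illustrated on a toy agreement problem: two models must produce the same tag sequence, and Lagrange multipliers are updated by subgradient steps to penalize their disagreements. The independent-per-position taggers and fixed step size below are illustrative stand-ins, not the paper's actual inter-sentence constraints.

```python
# Minimal dual decomposition sketch: two taggers must agree on a
# tag sequence. Each tagger maximizes its own score plus a signed
# penalty u; u is updated wherever the two solutions disagree.

def make_independent_tagger(scores, tags):
    """Toy model: each position i is scored independently via scores[i][tag]."""
    def tag(penalty):
        return [max(tags, key=lambda t: scores[i][t] + penalty[(i, t)])
                for i in range(len(scores))]
    return tag

def dual_decompose(tag_a, tag_b, n, tags, iters=50, step=1.0):
    """Subgradient dual decomposition over the agreement constraint y = z."""
    u = {(i, t): 0.0 for i in range(n) for t in tags}
    ya = None
    for _ in range(iters):
        ya = tag_a({k: +v for k, v in u.items()})  # argmax_y f(y) + u . y
        yb = tag_b({k: -v for k, v in u.items()})  # argmax_z g(z) - u . z
        if ya == yb:                               # agreement: exact solution
            return ya
        for i in range(n):                         # u <- u - step * (y - z)
            u[(i, ya[i])] -= step
            u[(i, yb[i])] += step
    return ya                                      # fall back if no agreement
```

When the subproblems agree, the shared solution is provably optimal for the joint objective, which is what makes the decoding "exact" when the method converges.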
6 0.57616222 109 emnlp-2012-Re-training Monolingual Parser Bilingually for Syntactic SMT
7 0.57581353 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
8 0.57293683 24 emnlp-2012-Biased Representation Learning for Domain Adaptation
9 0.5680452 42 emnlp-2012-Entropy-based Pruning for Phrase-based Machine Translation
10 0.56567091 5 emnlp-2012-A Discriminative Model for Query Spelling Correction with Latent Structural SVM
11 0.56492573 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
12 0.56261921 18 emnlp-2012-An Empirical Investigation of Statistical Significance in NLP
13 0.56257051 131 emnlp-2012-Unified Dependency Parsing of Chinese Morphological and Syntactic Structures
14 0.56139523 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
15 0.56046188 14 emnlp-2012-A Weakly Supervised Model for Sentence-Level Semantic Orientation Analysis with Multiple Experts
16 0.5579645 92 emnlp-2012-Multi-Domain Learning: When Do Domains Matter?
17 0.55792075 66 emnlp-2012-Improving Transition-Based Dependency Parsing with Buffer Transitions
18 0.55777246 129 emnlp-2012-Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries
19 0.5574019 82 emnlp-2012-Left-to-Right Tree-to-String Decoding with Prediction
20 0.55737203 54 emnlp-2012-Forced Derivation Tree based Model Training to Statistical Machine Translation