emnlp emnlp2010 emnlp2010-119 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Mark Dredze ; Tim Oates ; Christine Piatko
Abstract: Domain adaptation, the problem of adapting a natural language processing system trained in one domain to perform well in a different domain, has received significant attention. This paper addresses an important problem for deployed systems that has received little attention – detecting when such adaptation is needed by a system operating in the wild, i.e., performing classification over a stream of unlabeled examples. Our method uses A-distance, a metric for detecting shifts in data streams, combined with classification margins to detect domain shifts. We empirically show effective domain shift detection on a variety of data sets and shift conditions.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Domain adaptation, the problem of adapting a natural language processing system trained in one domain to perform well in a different domain, has received significant attention. [sent-5, score-0.352]
2 Our method uses A-distance, a metric for detecting shifts in data streams, combined with classification margins to detect domain shifts. [sent-9, score-0.678]
3 We empirically show effective domain shift detection on a variety of data sets and shift conditions. [sent-10, score-0.935]
4 This problem of domain shift is pervasive in NLP: any kind of model (a parser, a POS tagger, a sentiment classifier) is tested on data that do not match the training data. [sent-28, score-0.722]
5 Given a model and a stream of unlabeled instances, we are interested in automatically detecting changes in the feature distribution that negatively impact classification accuracy. [sent-29, score-0.636]
6 Other tasks related to changes in data distributions, like detecting concept drift in which the labeling function changes, may require labeled instances, but that is not the focus of this paper. [sent-33, score-0.492]
7 There is significant work on the related problem of adapting a classifier for a known domain shift. [sent-34, score-0.436]
8 Versions of this problem include adapting using only unlabeled target domain data (Blitzer et al. [sent-35, score-0.447]
9 , 2007; Jiang and Zhai, 2007), adapting using a limited amount of target domain labeled data (Daumé, 2007; Finkel and Manning, 2009), and learning across multiple domains simultaneously in an online setting (Dredze and Crammer, 2008b). [sent-37, score-0.526]
10 If not, we seek methods that detect this shift and trigger the use of an adaptation method. [sent-44, score-0.461]
11 Our domain shift detection problem can be decomposed into two subproblems: detecting distributional changes in streams of real numbers, and representing a stream of examples as a stream of real numbers informative for distribution change detection. [sent-45, score-1.965]
12 , previously used in other domain adaptation work (Blitzer et al. [sent-48, score-0.442]
13 Our experiments include evaluations on commonly used domain adaptation data and false change scenarios, as well as comparisons to supervised detection methods that observe label values, or have knowledge of the target domain. [sent-55, score-0.88]
14 We then show that margin based methods effectively capture information to detect domain shifts, and propose an alternate way of generating informative margin values. [sent-58, score-0.913]
15 2 Domain Shifts in Language Data The study of domain shifts in language data has been the purview of domain adaptation and transfer learning, which seek to adapt or transfer a model learned on one source domain with labeled data to another target domain with few or no labeled examples. [sent-60, score-1.772]
16 Empirical work on NLP domain shifts has focused on the former. [sent-64, score-0.529]
17 (2007) learned correspondences between features across domains and Jiang and Zhai (2007) weighted source domain examples by their similarity to the target distribution. [sent-66, score-0.542]
18 First, a change in domain will be signaled by a change in the feature distributions. [sent-68, score-0.566]
19 A similar problem to the one we consider is that of concept drift, where a stream of examples is labeled with a shifting labeling function (concept) (Nishida and Yamauchi, 2007; Widmer and Kubat, 1996). [sent-75, score-0.498]
20 First, concept drift can be measured using a stream of labeled examples, so system accuracy is directly measured. [sent-77, score-0.577]
21 Another concept drift detection algorithm, STEPD, uses a statistical test to continually monitor the possibly changing stream, measuring system accuracy directly, again using the labels it receives for each example (Nishida, 2008). [sent-80, score-0.346]
22 Second, concept drift assumes only changes in the labeling function, whereas domain adaptation relies on feature distribution changes. [sent-82, score-0.746]
23 Several properties of detecting domain shifts in natural language streams distinguish it from traditional domain adaptation, concept drift, and other related tasks: • No Target Distribution Examples Blitzer et al. [sent-83, score-1.164]
24 (2007) estimate the loss in accuracy from domain shift by discriminating between two data distributions. [sent-84, score-0.612]
25 Computationally Constrained Our approach must be fast, as we expect to run our domain shift detector alongside a deployed NLP system. [sent-90, score-0.424]
26 Despite these challenges, we show unsupervised stream-based methods that effectively identify shifts in domain in language data. [sent-94, score-0.529]
27 Our methods have low false positive rates of change detection, which is important since examples within a single domain display a large amount of variance, which could be mistaken for a domain change. [sent-96, score-1.031]
28 The maintainer of the system may be notified that performance is suffering, labels can be obtained for a sample of instances from the stream for retraining, or large volumes of unlabeled instances can be used for instance reweighting (Jiang and Zhai, 2007). [sent-98, score-0.4]
29 We selected three data sets commonly used in domain adaptation: spam (Jiang and Zhai, 2007), ACE 2005 named entity recognition (Jiang and Zhai, 2007), and sentiment (Blitzer et al. [sent-100, score-0.513]
30 Note that in all experiments, a shift in the domain yields a decrease in system accuracy. [sent-103, score-0.572]
31 We include an additional two types (music and video from Dredze and Crammer) in our false shift experiments and use unigram and bigram features, following Blitzer et al. [sent-110, score-0.414]
32 4 The A-Distance Our approach to detecting domain shifts in data streams that negatively impact system accuracy is based on the ability to (1) detect distributional changes in streams of real numbers and (2) convert document streams to streams of informative real numbers. [sent-111, score-1.427]
33 Theoretical work on domain adaptation showed that the A-distance (Kifer et al. [sent-113, score-0.474]
34 Given our interest in streaming data we return to the original stream formulation of A-distance. [sent-119, score-0.379]
35 Since the A-distance processes a stream of real numbers, we need to represent an example using a real number, such as the classification margin for that example. [sent-139, score-0.694]
36 We signal a domain shift when the A-distance between P and P′ is large (greater than a threshold). [sent-141, score-0.572]
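To make the detector concrete, the following is a minimal Python sketch of a streaming A-distance change detector, assuming a fixed calibration window as the reference distribution P, a sliding window as P′, a user-supplied set of intervals, and an illustrative threshold; the class name, window size, and eps value are assumptions for illustration, not the paper's calibrated settings.

from collections import deque

def interval_probs(values, intervals):
    # Empirical probability mass falling in each interval for a window of values.
    n = float(len(values))
    return [sum(lo <= v < hi for v in values) / n for (lo, hi) in intervals]

def a_distance(window_p, window_q, intervals):
    # Largest discrepancy in interval mass between two windows of real numbers.
    p = interval_probs(window_p, intervals)
    q = interval_probs(window_q, intervals)
    return max(abs(pi - qi) for pi, qi in zip(p, q))

class ADistanceDetector:
    def __init__(self, reference, intervals, window_size=100, eps=0.3):
        self.reference = list(reference)          # calibration window, distribution P
        self.intervals = intervals                # e.g. equal-width bins over margin values
        self.window = deque(maxlen=window_size)   # sliding window, distribution P'
        self.eps = eps                            # illustrative threshold

    def update(self, value):
        # Feed one real number (e.g. an unsigned margin); return True when a shift is signaled.
        self.window.append(value)
        if len(self.window) < self.window.maxlen:
            return False
        return a_distance(self.reference, self.window, self.intervals) > self.eps

In the paper's setting the stream values would be the (unsigned) classification margins, and the threshold would be calibrated from samples of the source distribution so that the false positive rate stays low.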
37 Note that any change detection would be a false positive because all values were sampled from the same distribution. [sent-164, score-0.423]
38 The vertical line at 500 instances marks the point of domain shift. [sent-176, score-0.38]
39 Note that in all cases, the mean accuracy drops, as do the mean margin values, demonstrating that both can indicate domain shifts. [sent-178, score-0.667]
40 5 A-Distance Over Margins Since shifts in domains correlate with changes in distributions, it is natural to begin by considering the observed features in each example. [sent-179, score-0.36]
41 Since the A-distance assumes a stream of single values, we can apply an A-distance detector to each feature (e. [sent-185, score-0.454]
42 We begin by examining visually the information content of the margin with regards to predicting a domain shift. [sent-196, score-0.555]
43 2 describes the setup, and the first row of the figure illustrates the effects of the shift on the source domain classifier’s empirical accuracy, measured on a window of the previous 100 examples. [sent-198, score-0.616]
44 2 shows the average unsigned margin value of an SVM classifier computed over the previous 100 examples in the stream. [sent-204, score-0.453]
45 The two dashed horizontal lines indicate the average margin value over source and target examples. [sent-205, score-0.404]
46 This difference suggests that the margin can be examined directly to detect a domain shift. [sent-207, score-0.622]
47 We evaluated the ability of A-distance trackers to detect such changes in margin values by simulating domain shifts using each domain pair in a task (books to dvds, weblogs to newswire, etc. [sent-209, score-1.183]
48 For each domain shift setting, we first trained a classifier on 1000 source domain instances. [sent-211, score-1.01]
49 The first 500 examples in the stream were used for calibrating our change detection methods. [sent-221, score-0.61]
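The overall simulation loop can be sketched as follows; this is a minimal sketch that assumes scikit-learn's LinearSVC as a stand-in for the paper's SVM/MIRA/CW classifiers, pre-extracted feature matrices, and a detector_factory that builds something like the ADistanceDetector sketched earlier from the 500 calibration margins. The helper name run_shift_simulation is hypothetical.

import numpy as np
from sklearn.svm import LinearSVC

def run_shift_simulation(X_source_train, y_source_train, stream_X, detector_factory):
    # Train on 1000 source-domain instances, as in the setup described above.
    clf = LinearSVC().fit(X_source_train[:1000], y_source_train[:1000])
    # Unsigned classification margins for every example in the unlabeled stream.
    margins = np.abs(clf.decision_function(stream_X))
    calibration, rest = margins[:500], margins[500:]
    detector = detector_factory(calibration)      # e.g. the ADistanceDetector sketched earlier
    for i, m in enumerate(rest):
        if detector.update(m):
            return 500 + i                        # stream index where a shift is signaled
    return None                                   # no shift detected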
50 2 Each plot represents one of the three classifiers (SVM, MIRA, CW) plotted on the vertical axis, where each point’s y-value indicates the number of examples observed after a shift occurred before the A-distance detector registered a change. [sent-235, score-0.668]
51 ) Notice that in many cases, a change was registered within 300 examples, showing that domain shifts can be reasonably detected using the margin values alone. [sent-240, score-0.983]
52 Equally important to detecting changes is robustness to false changes. [sent-241, score-0.399]
53 We evaluated the margin detector for false positives in two ways. [sent-242, score-0.619]
54 The highest false positive rate was about 1% (CW), while for the SVM experiments, not a single detector fired prematurely in any experiment. [sent-245, score-0.356]
55 Second, we sought to test the robustness of the method over a long stream of examples where no change occurred. [sent-248, score-0.509]
56 In this experiment, we selected 11 domains that had a sufficient number of examples to consider a long stream of source domain examples. [sent-249, score-0.777]
57 3 Rather than use 500 source domain examples followed by 1500 target domain examples, all 2000 examples were from the source domain. [sent-250, score-0.943]
58 ) 6 Confidence Weighted Margins In the previous section, we showed that margin values could be used to detect domain shifts. [sent-253, score-0.654]
59 We now explore ways to reduce the number of target domain examples needed to detect domain shift by improving the margin values. [sent-254, score-1.34]
60 Another task that relies on margins as measures of confidence is active learning, where uncertainty sampling for margin based systems is determined based on the magnitude of the predicted margin. [sent-257, score-0.423]
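A probabilistic margin can be derived from a Confidence-Weighted model because CW keeps a Gaussian distribution N(mu, Sigma) over the weight vector, so the margin on an example also has a mean and a variance. The sketch below shows one common way to turn that into a normalized, probabilistic value (the Gaussian probability that the signed margin is positive); it assumes a diagonal covariance and default values for unseen features, and the exact CWPM normalization used in the paper may differ in detail.

import math

def cw_probabilistic_margin(x, mu, sigma_diag):
    # x: dict feature -> value; mu: dict feature -> mean weight;
    # sigma_diag: dict feature -> weight variance (diagonal covariance).
    mean_margin = sum(mu.get(f, 0.0) * v for f, v in x.items())
    var_margin = sum(sigma_diag.get(f, 1.0) * v * v for f, v in x.items())
    z = mean_margin / math.sqrt(var_margin) if var_margin > 0 else mean_margin
    # Probability that the signed margin is positive under N(mean, var):
    # the Gaussian CDF of the variance-normalized margin.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))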
61 In each plot, CWPM (normalized margin) is plotted on the x-axis, indicating how many examples from the target domain were observed before the detector identified a change. [sent-277, score-0.649]
62 Of the 38 shifts, CWPM detected domain shifts faster than an SVM 34 times, MIRA 26 times and CW 27 times. [sent-281, score-0.58]
63 We repeated the experiments to detect false positives for each margin based method. [sent-282, score-0.524]
64 Table 1 shows the false positives for the 38 domain shifts considered as well as the 11 false shift domain shifts. [sent-283, score-1.465]
65 This shows that CWPM is a more useful indicator for detecting domain changes. [sent-285, score-0.458]
66 7 Gradual Shifts We have shown detection of sudden shifts between the source and target domains. [sent-286, score-0.421]
67 We evaluate this by modifying the stream as follows: the first 500 instances come from the source domain, and the remaining 1500 are sampled randomly from the source and target domains. [sent-288, score-0.472]
68 The probability of an instance being drawn from the target domain at time i is pi(x = target) = i/1500, where i counts from the start of the shift at index 500. [sent-289, score-0.629]
69 The probability of sampling target domain data increases uniformly over the stream. [sent-290, score-0.367]
70 At index 750 after the start of the shift each domain is equally likely. [sent-291, score-0.572]
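A sketch of this gradual-shift stream construction, assuming the source and target example pools are given; the helper name and the seed are illustrative.

import random

def gradual_shift_stream(source_pool, target_pool, seed=0):
    # 500 source instances, then 1500 instances where the chance of drawing
    # from the target domain grows linearly: p_i(target) = i / 1500.
    rng = random.Random(seed)
    stream = [rng.choice(source_pool) for _ in range(500)]
    for i in range(1, 1501):
        pool = target_pool if rng.random() < i / 1500.0 else source_pool
        stream.append(rng.choice(pool))
    return stream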
71 4 shows CWPM still performs best, but results are close (SVM: 22 of 32, MIRA & CW results in Figure 4). Figure 4: Gradual shift detection with SVM, MIRA or CW vs. [sent-294, score-0.363]
72 Table 1: False positives for the domain shift and false domain shift experiments for methods in corresponding sections. [sent-298, score-0.986]
73 Each setting was run 10 times, resulting in 380 true domain shifts and 110 false shifts. [sent-299, score-0.716]
74 In particular, we investigate two types of supervised knowledge: the labels of examples in the stream and knowledge of the target domain. [sent-307, score-0.438]
75 4 we showed that both the margin and recent classifier accuracy indicate when shifts in domains occur (Fig. [sent-311, score-0.662]
76 Over this 1/0 stream produced by checking classifier accuracy we ran an A-distance detector, with intervals set for 1s and 0s (10,000 uniform samples to calibrate the threshold for a false positive rate of 0. [sent-318, score-0.655]
77 ) If an unusual number of 0s or 1s occurs (more or fewer mistakes than on the source domain), a change is detected. [sent-320, score-0.482]
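A minimal sketch of this supervised accuracy-stream baseline: each labeled stream example is mapped to 1 (correct) or 0 (mistake), and the resulting binary stream is monitored by the same kind of change detector. The scikit-learn-style predict call and the interval choice are assumptions for illustration.

def accuracy_stream(clf, labeled_stream):
    # labeled_stream: iterable of (feature_vector, gold_label) pairs.
    # Yields 1.0 for a correct prediction and 0.0 for a mistake.
    for x, y in labeled_stream:
        yield 1.0 if clf.predict([x])[0] == y else 0.0

# The 0/1 values can be fed to a detector such as the earlier ADistanceDetector,
# e.g. with intervals = [(-0.5, 0.5), (0.5, 1.5)] so that 0s and 1s fall in separate bins.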
78 ) Despite this supervised information, CWPM still detects domain changes faster than with labeled examples. [sent-323, score-0.484]
79 While the average accuracy drops, the instantaneous value is very noisy, suggesting that even this additional information may not yield better domain shift detection. [sent-326, score-0.612]
80 Figure 5: An A-distance accuracy detector, run over a stream of 1s and 0s indicating correct and incorrect predictions of the classifier on examples in a stream. [sent-330, score-0.421]
81 The bulk of points above the line indicate that CWPM is more effective at detecting domain change. [sent-331, score-0.458]
82 CWPM had a single false positive and the accuracy detector had no false positives. [sent-332, score-0.548]
83 In this setting, we know that a shift will occur and we know to which domain it will occur. [sent-335, score-0.572]
84 This requires a sample of (unlabeled) target domain examples when the target domain is not known ahead of time. [sent-336, score-0.823]
85 Using a common approach to detecting domain differences when data is available from both domains (Ben-David et al. [sent-337, score-0.5]
86 6 shows the detection rate of CWPM versus A-distance over the domain classifier stream. [sent-346, score-0.495]
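A sketch of the supervised domain-classifier baseline, assuming unlabeled samples from both domains are available: a binary classifier is trained to separate source (0) from target (1) examples, every stream example is labeled by it, and the 0/1 output stream is fed to a change detector. The LinearSVC choice and the helper name are illustrative.

import numpy as np
from sklearn.svm import LinearSVC

def domain_label_stream(X_source_sample, X_target_sample, stream_X):
    # Train a binary classifier to separate source (0) from target (1) examples,
    # then label every stream example; the 0/1 output stream is monitored by a
    # change detector exactly like the accuracy stream above.
    X = np.vstack([X_source_sample, X_target_sample])
    y = np.array([0] * len(X_source_sample) + [1] * len(X_target_sample))
    domain_clf = LinearSVC().fit(X, y)
    return domain_clf.predict(stream_X)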
87 The domain classifier is very fast, in almost every case (save 1) detecting the shift less than 400 examples after it happens. [sent-348, score-0.351]
88 When CWPM is slow to detect a change (over 400 examples), the domain classifier is the clear winner. [sent-349, score-0.589]
89 These results suggest that while a sample of target domain examples is very helpful, our CWPM ap- proach can also be effective when such samples are not available. [sent-351, score-0.501]
90 An alternative formulation of domain adaptation trains on different corpora from many different domains, then uses linear combinations of models trained on the different corpora (McClosky et al. [sent-360, score-0.442]
91 Work in novelty detection is relevant to the task of detecting domain shifts (Scholkopf et al. [sent-362, score-0.778]
92 , 2000), Figure 6: A-distance over a stream of 1s and 0s produced by a supervised classifier trained to differentiate between the source and target domain. [sent-363, score-0.477]
93 However, for many shifts, the margin based A-distance detector is still competitive. [sent-365, score-0.437]
94 CWPM had a single false positive while the domain classifier stream had 2 false positives in these experiments. [sent-366, score-0.648]
95 We are also motivated by the problem of detecting genre shift in addition to domain shift, as in the ACE 2005 data set shifts from newswire to transcripts and blogs. [sent-368, score-1.017]
96 10 Conclusion While there are a number of methods for domain adaptation, a system first needs to determine that a domain shift has occurred. [sent-379, score-0.882]
97 We have presented methods for automatically detecting such domain shifts from a stream of (unlabeled) examples that require limited computation and memory by virtue of operating on fixed-size windows of data. [sent-380, score-1.058]
98 Our methods were evaluated empirically on a variety of domain shifts using NLP data sets and are shown to be sensitive to shifts while maintaining a low rate of false positives. [sent-381, score-0.9]
99 Additionally, we showed improved detection results using a probabilistic margin based on Confidence Weighted learning. [sent-382, score-0.378]
100 Our methods are promising as tools to accompany the deployment of domain adaptation algorithms, so that a complete system can first identify when a domain shift has occurred before automatically adapting to the new domain. [sent-384, score-1.056]
wordName wordTfidf (topN-words)
[('domain', 0.31), ('stream', 0.292), ('cwpm', 0.28), ('shift', 0.262), ('margin', 0.245), ('shifts', 0.219), ('dredze', 0.163), ('detector', 0.162), ('cw', 0.156), ('false', 0.152), ('detecting', 0.148), ('adaptation', 0.132), ('change', 0.128), ('drift', 0.128), ('blitzer', 0.128), ('crammer', 0.125), ('mira', 0.121), ('intervals', 0.105), ('detection', 0.101), ('streams', 0.1), ('changes', 0.099), ('margins', 0.094), ('examples', 0.089), ('streaming', 0.087), ('confidence', 0.084), ('classifier', 0.084), ('concept', 0.077), ('svm', 0.074), ('koby', 0.07), ('spam', 0.067), ('detect', 0.067), ('sentiment', 0.066), ('ace', 0.064), ('da', 0.062), ('books', 0.06), ('positives', 0.06), ('plot', 0.059), ('classification', 0.059), ('horizontal', 0.058), ('zhai', 0.057), ('target', 0.057), ('axis', 0.054), ('jiang', 0.053), ('kifer', 0.052), ('muthukrishnan', 0.052), ('nishida', 0.052), ('detected', 0.051), ('electronics', 0.05), ('real', 0.049), ('shai', 0.047), ('informative', 0.046), ('daum', 0.046), ('dvds', 0.045), ('kitchen', 0.045), ('samples', 0.045), ('falls', 0.044), ('source', 0.044), ('adapting', 0.042), ('domains', 0.042), ('positive', 0.042), ('tracking', 0.04), ('accuracy', 0.04), ('genre', 0.04), ('labeled', 0.04), ('broadcast', 0.038), ('interval', 0.038), ('unlabeled', 0.038), ('newswire', 0.038), ('named', 0.037), ('mean', 0.036), ('setting', 0.035), ('instances', 0.035), ('aonfc', 0.035), ('dewdney', 0.035), ('finn', 0.035), ('histograms', 0.035), ('klinkenberg', 0.035), ('kyosuke', 0.035), ('petrovi', 0.035), ('summoned', 0.035), ('unsigned', 0.035), ('widmer', 0.035), ('detects', 0.035), ('vertical', 0.035), ('bp', 0.035), ('fernando', 0.033), ('obama', 0.033), ('entity', 0.033), ('showed', 0.032), ('plotted', 0.031), ('reviews', 0.031), ('feldman', 0.03), ('rai', 0.03), ('executives', 0.03), ('tracker', 0.03), ('registered', 0.03), ('gradual', 0.03), ('toe', 0.03), ('gandrabur', 0.03), ('pipeline', 0.03)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999905 119 emnlp-2010-We're Not in Kansas Anymore: Detecting Domain Changes in Streams
Author: Mark Dredze ; Tim Oates ; Christine Piatko
Abstract: Domain adaptation, the problem of adapting a natural language processing system trained in one domain to perform well in a different domain, has received significant attention. This paper addresses an important problem for deployed systems that has received little attention – detecting when such adaptation is needed by a system operating in the wild, i.e., performing classification over a stream of unlabeled examples. Our method uses A-distance, a metric for detecting shifts in data streams, combined with classification margins to detect domain shifts. We empirically show effective domain shift detection on a variety of data sets and shift conditions.
2 0.1880945 104 emnlp-2010-The Necessity of Combining Adaptation Methods
Author: Ming-Wei Chang ; Michael Connor ; Dan Roth
Abstract: Problems stemming from domain adaptation continue to plague the statistical natural language processing community. There has been continuing work trying to find general purpose algorithms to alleviate this problem. In this paper we argue that existing general purpose approaches usually only focus on one of two issues related to the difficulties faced by adaptation: 1) difference in base feature statistics or 2) task differences that can be detected with labeled data. We argue that it is necessary to combine these two classes of adaptation algorithms, using evidence collected through theoretical analysis and simulated and real-world data experiments. We find that the combined approach often outperforms the individual adaptation approaches. By combining simple approaches from each class of adaptation algorithm, we achieve state-of-the-art results for both Named Entity Recognition adaptation task and the Preposition Sense Disambiguation adaptation task. Second, we also show that applying an adaptation algorithm that finds shared representation between domains often impacts the choice in adaptation algorithm that makes use of target labeled data.
3 0.18794401 30 emnlp-2010-Confidence in Structured-Prediction Using Confidence-Weighted Models
Author: Avihai Mejer ; Koby Crammer
Abstract: Confidence-Weighted linear classifiers (CW) and its successors were shown to perform well on binary and multiclass NLP problems. In this paper we extend the CW approach for sequence learning and show that it achieves state-of-the-art performance on four noun phrase chucking and named entity recognition tasks. We then derive few algorithmic approaches to estimate the prediction’s correctness of each label in the output sequence. We show that our approach provides a reliable relative correctness information as it outperforms other alternatives in ranking label-predictions according to their error. We also show empirically that our methods output close to absolute estimation of error. Finally, we show how to use this information to improve active learning.
4 0.12685366 39 emnlp-2010-EMNLP 044
Author: George Foster
Abstract: We describe a new approach to SMT adaptation that weights out-of-domain phrase pairs according to their relevance to the target domain, determined by both how similar to it they appear to be, and whether they belong to general language or not. This extends previous work on discriminative weighting by using a finer granularity, focusing on the properties of instances rather than corpus components, and using a simpler training procedure. We incorporate instance weighting into a mixture-model framework, and find that it yields consistent improvements over a wide range of baselines.
5 0.11366848 33 emnlp-2010-Cross Language Text Classification by Model Translation and Semi-Supervised Learning
Author: Lei Shi ; Rada Mihalcea ; Mingjun Tian
Abstract: In this paper, we introduce a method that automatically builds text classifiers in a new language by training on already labeled data in another language. Our method transfers the classification knowledge across languages by translating the model features and by using an Expectation Maximization (EM) algorithm that naturally takes into account the ambiguity associated with the translation of a word. We further exploit the readily available unlabeled data in the target language via semisupervised learning, and adapt the translated model to better fit the data distribution of the target language.
6 0.10909205 85 emnlp-2010-Negative Training Data Can be Harmful to Text Classification
7 0.10766554 41 emnlp-2010-Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models
8 0.10419531 84 emnlp-2010-NLP on Spoken Documents Without ASR
9 0.088810518 112 emnlp-2010-Unsupervised Discovery of Negative Categories in Lexicon Bootstrapping
10 0.086728498 62 emnlp-2010-Improving Mention Detection Robustness to Noisy Input
11 0.079526164 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing
12 0.0792723 64 emnlp-2010-Incorporating Content Structure into Text Analysis Applications
13 0.0707523 20 emnlp-2010-Automatic Detection and Classification of Social Events
14 0.06929379 49 emnlp-2010-Extracting Opinion Targets in a Single and Cross-Domain Setting with Conditional Random Fields
15 0.063734025 43 emnlp-2010-Enhancing Domain Portability of Chinese Segmentation Model Using Chi-Square Statistics and Bootstrapping
16 0.058016066 11 emnlp-2010-A Semi-Supervised Approach to Improve Classification of Infrequent Discourse Relations Using Feature Vector Extension
17 0.057438467 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice
18 0.056043237 35 emnlp-2010-Discriminative Sample Selection for Statistical Machine Translation
19 0.054672696 27 emnlp-2010-Clustering-Based Stratified Seed Sampling for Semi-Supervised Relation Classification
20 0.053366113 83 emnlp-2010-Multi-Level Structured Models for Document-Level Sentiment Classification
topicId topicWeight
[(0, 0.221), (1, 0.122), (2, -0.107), (3, 0.075), (4, -0.107), (5, -0.019), (6, 0.263), (7, 0.183), (8, -0.016), (9, 0.2), (10, 0.124), (11, 0.088), (12, -0.131), (13, 0.155), (14, 0.036), (15, -0.026), (16, 0.005), (17, -0.097), (18, -0.118), (19, -0.076), (20, 0.02), (21, -0.104), (22, 0.074), (23, 0.003), (24, -0.047), (25, 0.068), (26, 0.016), (27, 0.056), (28, 0.082), (29, -0.05), (30, 0.032), (31, -0.068), (32, -0.064), (33, -0.013), (34, 0.006), (35, 0.056), (36, -0.052), (37, -0.043), (38, 0.005), (39, 0.066), (40, -0.015), (41, 0.043), (42, -0.11), (43, 0.001), (44, -0.02), (45, 0.027), (46, -0.08), (47, -0.161), (48, -0.075), (49, -0.041)]
simIndex simValue paperId paperTitle
same-paper 1 0.97278559 119 emnlp-2010-We're Not in Kansas Anymore: Detecting Domain Changes in Streams
Author: Mark Dredze ; Tim Oates ; Christine Piatko
Abstract: Domain adaptation, the problem of adapting a natural language processing system trained in one domain to perform well in a different domain, has received significant attention. This paper addresses an important problem for deployed systems that has received little attention – detecting when such adaptation is needed by a system operating in the wild, i.e., performing classification over a stream of unlabeled examples. Our method uses A-distance, a metric for detecting shifts in data streams, combined with classification margins to detect domain shifts. We empirically show effective domain shift detection on a variety of data sets and shift conditions.
2 0.69690895 104 emnlp-2010-The Necessity of Combining Adaptation Methods
Author: Ming-Wei Chang ; Michael Connor ; Dan Roth
Abstract: Problems stemming from domain adaptation continue to plague the statistical natural language processing community. There has been continuing work trying to find general purpose algorithms to alleviate this problem. In this paper we argue that existing general purpose approaches usually only focus on one of two issues related to the difficulties faced by adaptation: 1) difference in base feature statistics or 2) task differences that can be detected with labeled data. We argue that it is necessary to combine these two classes of adaptation algorithms, using evidence collected through theoretical analysis and simulated and real-world data experiments. We find that the combined approach often outperforms the individual adaptation approaches. By combining simple approaches from each class of adaptation algorithm, we achieve state-of-the-art results for both Named Entity Recognition adaptation task and the Preposition Sense Disambiguation adaptation task. Second, we also show that applying an adaptation algorithm that finds shared representation between domains often impacts the choice in adaptation algorithm that makes use of target labeled data.
3 0.61316574 30 emnlp-2010-Confidence in Structured-Prediction Using Confidence-Weighted Models
Author: Avihai Mejer ; Koby Crammer
Abstract: Confidence-Weighted linear classifiers (CW) and its successors were shown to perform well on binary and multiclass NLP problems. In this paper we extend the CW approach for sequence learning and show that it achieves state-of-the-art performance on four noun phrase chucking and named entity recognition tasks. We then derive few algorithmic approaches to estimate the prediction’s correctness of each label in the output sequence. We show that our approach provides a reliable relative correctness information as it outperforms other alternatives in ranking label-predictions according to their error. We also show empirically that our methods output close to absolute estimation of error. Finally, we show how to use this information to improve active learning.
4 0.48323846 37 emnlp-2010-Domain Adaptation of Rule-Based Annotators for Named-Entity Recognition Tasks
Author: Laura Chiticariu ; Rajasekar Krishnamurthy ; Yunyao Li ; Frederick Reiss ; Shivakumar Vaithyanathan
Abstract: Named-entity recognition (NER) is an important task required in a wide variety of applications. While rule-based systems are appealing due to their well-known “explainability,” most, if not all, state-of-the-art results for NER tasks are based on machine learning techniques. Motivated by these results, we explore the following natural question in this paper: Are rule-based systems still a viable approach to named-entity recognition? Specifically, we have designed and implemented a high-level language NERL on top of SystemT, a general-purpose algebraic information extraction system. NERL is tuned to the needs of NER tasks and simplifies the process of building, understanding, and customizing complex rule-based named-entity annotators. We show that these customized annotators match or outperform the best published results achieved with machine learning techniques. These results confirm that we can reap the benefits of rule-based extractors’ explainability without sacrificing accuracy. We conclude by discussing lessons learned while building and customizing complex rule-based annotators and outlining several research directions towards facilitating rule development.
5 0.47029674 39 emnlp-2010-EMNLP 044
Author: George Foster
Abstract: We describe a new approach to SMT adaptation that weights out-of-domain phrase pairs according to their relevance to the target domain, determined by both how similar to it they appear to be, and whether they belong to general language or not. This extends previous work on discriminative weighting by using a finer granularity, focusing on the properties of instances rather than corpus components, and using a simpler training procedure. We incorporate instance weighting into a mixture-model framework, and find that it yields consistent improvements over a wide range of baselines.
6 0.46411127 41 emnlp-2010-Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models
7 0.44032302 84 emnlp-2010-NLP on Spoken Documents Without ASR
8 0.4388518 85 emnlp-2010-Negative Training Data Can be Harmful to Text Classification
9 0.34727344 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications
10 0.3349748 112 emnlp-2010-Unsupervised Discovery of Negative Categories in Lexicon Bootstrapping
11 0.3194775 33 emnlp-2010-Cross Language Text Classification by Model Translation and Semi-Supervised Learning
12 0.29840496 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing
14 0.28649926 27 emnlp-2010-Clustering-Based Stratified Seed Sampling for Semi-Supervised Relation Classification
15 0.27935436 62 emnlp-2010-Improving Mention Detection Robustness to Noisy Input
16 0.26195353 43 emnlp-2010-Enhancing Domain Portability of Chinese Segmentation Model Using Chi-Square Statistics and Bootstrapping
18 0.24941066 64 emnlp-2010-Incorporating Content Structure into Text Analysis Applications
19 0.24881269 122 emnlp-2010-WikiWars: A New Corpus for Research on Temporal Expressions
20 0.23484614 49 emnlp-2010-Extracting Opinion Targets in a Single and Cross-Domain Setting with Conditional Random Fields
topicId topicWeight
[(3, 0.032), (10, 0.014), (12, 0.04), (29, 0.082), (30, 0.037), (32, 0.018), (52, 0.049), (56, 0.076), (62, 0.019), (66, 0.136), (72, 0.049), (76, 0.017), (77, 0.011), (79, 0.013), (85, 0.268), (87, 0.053), (89, 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 0.76794654 119 emnlp-2010-We're Not in Kansas Anymore: Detecting Domain Changes in Streams
Author: Mark Dredze ; Tim Oates ; Christine Piatko
Abstract: Domain adaptation, the problem of adapting a natural language processing system trained in one domain to perform well in a different domain, has received significant attention. This paper addresses an important problem for deployed systems that has received little attention – detecting when such adaptation is needed by a system operating in the wild, i.e., performing classification over a stream of unlabeled examples. Our method uses A-distance, a metric for detecting shifts in data streams, combined with classification margins to detect domain shifts. We empirically show effective domain shift detection on a variety of data sets and shift conditions.
2 0.57515001 35 emnlp-2010-Discriminative Sample Selection for Statistical Machine Translation
Author: Sankaranarayanan Ananthakrishnan ; Rohit Prasad ; David Stallard ; Prem Natarajan
Abstract: Production of parallel training corpora for the development of statistical machine translation (SMT) systems for resource-poor languages usually requires extensive manual effort. Active sample selection aims to reduce the labor, time, and expense incurred in producing such resources, attaining a given performance benchmark with the smallest possible training corpus by choosing informative, nonredundant source sentences from an available candidate pool for manual translation. We present a novel, discriminative sample selection strategy that preferentially selects batches of candidate sentences with constructs that lead to erroneous translations on a held-out development set. The proposed strategy supports a built-in diversity mechanism that reduces redundancy in the selected batches. Simulation experiments on English-to-Pashto and Spanish-to-English translation tasks demon- strate the superiority of the proposed approach to a number of competing techniques, such as random selection, dissimilarity-based selection, as well as a recently proposed semisupervised active learning strategy.
3 0.57347339 86 emnlp-2010-Non-Isomorphic Forest Pair Translation
Author: Hui Zhang ; Min Zhang ; Haizhou Li ; Eng Siong Chng
Abstract: This paper studies two issues, non-isomorphic structure translation and target syntactic structure usage, for statistical machine translation in the context of forest-based tree to tree sequence translation. For the first issue, we propose a novel non-isomorphic translation framework to capture more non-isomorphic structure mappings than traditional tree-based and tree-sequence-based translation methods. For the second issue, we propose a parallel space searching method to generate hypothesis using tree-to-string model and evaluate its syntactic goodness using tree-to-tree/tree sequence model. This not only reduces the search complexity by merging spurious-ambiguity translation paths and solves the data sparseness issue in training, but also serves as a syntax-based target language model for better grammatical generation. Experiment results on the benchmark data show our proposed two solutions are very effective, achieving significant performance improvement over baselines when applying to different translation models.
4 0.5708946 23 emnlp-2010-Automatic Keyphrase Extraction via Topic Decomposition
Author: Zhiyuan Liu ; Wenyi Huang ; Yabin Zheng ; Maosong Sun
Abstract: Existing graph-based ranking methods for keyphrase extraction compute a single importance score for each word via a single random walk. Motivated by the fact that both documents and words can be represented by a mixture of semantic topics, we propose to decompose traditional random walk into multiple random walks specific to various topics. We thus build a Topical PageRank (TPR) on word graph to measure word importance with respect to different topics. After that, given the topic distribution of the document, we further calculate the ranking scores of words and extract the top ranked ones as keyphrases. Experimental results show that TPR outperforms state-of-the-art keyphrase extraction methods on two datasets under various evaluation metrics.
5 0.56849164 69 emnlp-2010-Joint Training and Decoding Using Virtual Nodes for Cascaded Segmentation and Tagging Tasks
Author: Xian Qian ; Qi Zhang ; Yaqian Zhou ; Xuanjing Huang ; Lide Wu
Abstract: Many sequence labeling tasks in NLP require solving a cascade of segmentation and tagging subtasks, such as Chinese POS tagging, named entity recognition, and so on. Traditional pipeline approaches usually suffer from error propagation. Joint training/decoding in the cross-product state space could cause too many parameters and high inference complexity. In this paper, we present a novel method which integrates graph structures of two subtasks into one using virtual nodes, and performs joint training and decoding in the factorized state space. Experimental evaluations on CoNLL 2000 shallow parsing data set and Fourth SIGHAN Bakeoff CTB POS tagging data set demonstrate the superiority of our method over cross-product, pipeline and candidate reranking approaches.
6 0.565965 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment
7 0.56322837 120 emnlp-2010-What's with the Attitude? Identifying Sentences with Attitude in Online Discussions
8 0.56276035 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice
9 0.56100631 58 emnlp-2010-Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation
10 0.5602895 49 emnlp-2010-Extracting Opinion Targets in a Single and Cross-Domain Setting with Conditional Random Fields
11 0.55963212 105 emnlp-2010-Title Generation with Quasi-Synchronous Grammar
12 0.55854732 82 emnlp-2010-Multi-Document Summarization Using A* Search and Discriminative Learning
13 0.55728757 18 emnlp-2010-Assessing Phrase-Based Translation Models with Oracle Decoding
14 0.55633175 104 emnlp-2010-The Necessity of Combining Adaptation Methods
15 0.55604565 25 emnlp-2010-Better Punctuation Prediction with Dynamic Conditional Random Fields
16 0.55527151 84 emnlp-2010-NLP on Spoken Documents Without ASR
17 0.55392975 45 emnlp-2010-Evaluating Models of Latent Document Semantics in the Presence of OCR Errors
18 0.55320311 30 emnlp-2010-Confidence in Structured-Prediction Using Confidence-Weighted Models
19 0.5523876 31 emnlp-2010-Constraints Based Taxonomic Relation Classification
20 0.55232358 63 emnlp-2010-Improving Translation via Targeted Paraphrasing