emnlp emnlp2013 emnlp2013-29 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Di Wang ; Chenyan Xiong ; William Yang Wang
Abstract: Models learned from one domain might not be generalizable to other domains (Ben-David et al., 2006; Ben-David et al., 2010). Multi-Domain learning (MDL) assumes that the domain labels in the dataset are known. However, when there are multiple metadata attributes available, it is not always straightforward to select a single best attribute for domain partition, and it is possible that combining more than one metadata attribute (including continuous attributes) can lead to better MDL performance. In this work, we propose an automatic domain partitioning approach that aims at providing better domain identities for MDL. We use a supervised clustering approach that learns the domain distance between data instances, and then clusters the data into better domains for MDL. Our experiment on real multi-domain datasets shows that using our automatically generated domain partition improves over popular MDL methods.
Reference: text
sentIndex sentText sentNum sentScore
2 Models learned from one domain might not be generalizable to other domains (Ben-David et al., 2006; Ben-David et al., 2010). [sent-6, score-0.161]
3 Multi-Domain learning (MDL) assumes that the domain labels in the dataset are known. [sent-9, score-0.343]
4 However, when there are multiple metadata attributes available, it is not always straightforward to select a single best attribute for domain partition, and it is possible that combining more than one metadata attribute (including continuous attributes) can lead to better MDL performance. [sent-10, score-1.204]
5 In this work, we propose an automatic domain partitioning approach that aims at providing better domain identities for MDL. [sent-11, score-0.929]
6 We use a supervised clustering approach that learns the domain distance between data instances, and then clusters the data into better domains for MDL. [sent-12, score-0.726]
7 Our experiment on real multi-domain datasets shows that using our automatically generated domain partition improves over popular MDL methods. [sent-13, score-0.55]
8 Multi-domain learning (MDL) methods assume that data come from several domains and make use of domain labels to improve modeling performance (Daumé III, 2007). [sent-16, score-0.504]
9 The motivation of using MDL is that datasets from different domains could be different in two ways. [sent-17, score-0.194]
10 First, the feature distribution p(x) could be domain specific, meaning that the importance of each feature is different across domains. [sent-18, score-0.343]
11 Second, the distribution of the label Y given X, p(y|x), of different domains could also be different. [sent-19, score-0.053]
12 These differences could create problems for traditional machine learning methods: models learned from one domain might not be generalizable to other domains. One common assumption of MDL methods is that the domain identities are pre-defined. [sent-20, score-0.795]
13 For example, in the multi-domain Amazon product review dataset (Finkel and Manning, 2009), the product categories are typically used as the domain identities. [sent-21, score-0.567]
14 Consider the Amazon product reviews, where we have multiple attributes attached to each review: for example, product category, reviewer location, price, and number of feedback. [sent-24, score-0.244]
15 Or should we use all of these metadata and partition the data into many small domains? [sent-26, score-0.161]
16 In this paper, we investigate the problem of automatic domain partitioning. [sent-27, score-0.343]
17 We propose an empirical domain difference testing method to examine whether two groups of data are i.i.d. [sent-28, score-0.48]
18 or generated from different distributions, and how different they are. [sent-30, score-0.023]
19 Using this approach, we generate data pairs that belong to the same distribution, and data pairs that should be partitioned into different domains. [sent-31, score-0.12]
20 These pairs are then used as training data for a supervised clustering algorithm, which automatically partitions the dataset into several domains. [sent-32, score-0.193]
21 In the evaluation, we show that our automatically-partitioned domains improve the performance of two popular MDL methods on real sentiment analysis datasets. [sent-33, score-0.184]
22 (2013) proposed a Multi-Attribute Multi-Domain learning (MAMD) method, which also exploited multiple dimensions of metadata [sent-35, score-0.038]
23 and provided extensions to two traditional MDL methods. [sent-37, score-0.025]
24 However, extensions to the MAMD setting may not be trivial for every MDL algorithm, while our method serves as a pre-processing step and can be easily used for all MDL approaches. [sent-38, score-0.025]
25 In addition to this, MAMD only works with categorical metadata, and cannot fully utilize information in the form of continuous metadata values. [sent-39, score-0.45]
26 For example, on Amazon sentiment analysis data, X is the feature matrix extracted from reviews, Y is the positive or negative label vector, and M is the metadata matrix associated with reviews (e.g. [sent-41, score-0.45]
27 Our approach works as follows: in training, we first use an empirical domain difference testing method to detect whether two groups of data should be considered as different domains; after that we apply supervised clustering to learn the distance metric between two data points, i.e. [sent-44, score-0.728]
28 2.1 Empirical Domain Difference Test The key motivation of MDL is that a model that fits one domain may not fit other domains well. [sent-48, score-0.376]
29 Following the same motivation, we propose an empirical method for domain difference testing, called Domain Model Loss (DML), which provides the domain difference score d(G1, G2) between two groups of data G1 = {X1, Y1} and G2 = {X2, Y2}. [sent-49, score-0.831]
30 Domain Model Loss If the mapping functions f1 : X1 → Y1 and f2 : X2 → Y2 are different for two data groups, we could directly use the disagreement of f1 and f2 as the domain difference score. [sent-50, score-0.385]
31 More specifically, if we train two classifiers $\hat{f}_1 : X_1 \to Y_1$ and $\hat{f}_2 : X_2 \to Y_2$ individually on $G_1$ and $G_2$, we can compute the $K$-fold empirical losses $\hat{l}(\hat{f}_1, G_1) = \frac{1}{K} \sum_i \text{Error of } \hat{f}_1 \text{ on the } i\text{-th fold of } G_1$ and $\hat{l}(\hat{f}_2, G_2) = \frac{1}{K} \sum_i \text{Error of } \hat{f}_2 \text{ on the } i\text{-th fold of } G_2$. [sent-51, score-0.072]
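A minimal sketch of how the Domain Model Loss idea could be computed with scikit-learn. Only the two K-fold losses above appear verbatim in the text, so the way they combine with the cross-group errors into a single score d(G1, G2) is an assumption here, as is the choice of BernoulliNB as the base classifier.

```python
# Hedged sketch of the Domain Model Loss (DML) idea: compare a classifier's
# K-fold in-group error with its error on the other group.
import numpy as np
from sklearn.naive_bayes import BernoulliNB
from sklearn.model_selection import cross_val_score

def kfold_loss(X, y, k=5):
    """K-fold empirical loss: average error over the K held-out folds."""
    acc = cross_val_score(BernoulliNB(), X, y, cv=k, scoring="accuracy")
    return 1.0 - acc.mean()

def domain_model_loss(X1, y1, X2, y2, k=5):
    """Assumed difference score d(G1, G2): how much worse each group's model
    does on the other group than on its own held-out folds."""
    in1, in2 = kfold_loss(X1, y1, k), kfold_loss(X2, y2, k)
    f1 = BernoulliNB().fit(X1, y1)
    f2 = BernoulliNB().fit(X2, y2)
    cross12 = 1.0 - f1.score(X2, y2)  # f1 evaluated on G2
    cross21 = 1.0 - f2.score(X1, y1)  # f2 evaluated on G1
    return (cross12 - in2) + (cross21 - in1)
```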
32 2.2 Supervised Clustering for Domain Partitioning Our domain difference test method calculates the distance between two partitioned data groups. [sent-55, score-0.539]
33 However, to directly use it for domain partitioning, we must go through all possible combinations of domain assignments in exponential time, which is infeasible. [sent-56, score-0.716]
34 Our solution is to use a polynomial-time supervised clustering method developed by Xing et al. [sent-57, score-0.091]
35 (2002) to learn a distance function that calculates the distance between any two data points. [sent-58, score-0.206]
36 Formally, given a set of data pairs D that belong to different domains and a set of data pairs S that belong to the same domain, it learns a distance metric A by solving $\max_A g(A)$ subject to constraints that keep the same-domain pairs in S close under A and keep A positive semi-definite (Xing et al., 2002). [sent-59, score-0.257]
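The sketch below illustrates the diagonal special case of this objective: maximize the summed A-distances over the different-domain pairs D while the summed squared A-distances over the same-domain pairs S stay at most 1. Projected gradient ascent is a simplification standing in for the actual optimization procedure of Xing et al. (2002), and the step size and iteration count are illustrative.

```python
# Hedged sketch: learn a diagonal metric A = diag(a) in the style of
# Xing et al. (2002) by projected gradient ascent.
import numpy as np

def learn_diag_metric(S_diffs, D_diffs, n_iters=500, lr=0.05, eps=1e-12):
    """S_diffs / D_diffs: (n_pairs, dim) arrays of x_i - x_j difference
    vectors for same-domain pairs S and different-domain pairs D."""
    S2_tot = (S_diffs ** 2).sum(axis=0)  # constraint is linear: S2_tot @ a <= 1
    D2 = D_diffs ** 2
    a = np.ones(S_diffs.shape[1])
    a = a / max(S2_tot @ a, eps)         # start on the constraint boundary
    for _ in range(n_iters):
        dists = np.sqrt(np.maximum(D2 @ a, eps))  # ||x_i - x_j||_A over D
        grad = (D2 / (2.0 * dists[:, None])).sum(axis=0)
        a = np.maximum(a + lr * grad, 0.0)        # ascent step, keep a >= 0
        a = a / max(S2_tot @ a, eps)              # rescale back to the boundary
    return np.diag(a)
```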
37 The metadata M are preprocessed as follows: 1) Each categorical attribute was converted to several binary questions, one per category, and each binary question was considered as one metadata dimension in the ADP method. [sent-63, score-0.838]
38 For example, if a categorical attribute “Product Type” has two values “Music” and “Electronics”, then there will be two dimensions of metadata corresponding to “Product Type” in ADP. [sent-64, score-0.447]
39 These two metadata dimensions correspond to the binary questions “Is Product Type Music” and “Is Product Type Electronics”. [sent-65, score-0.362]
40 2) Each continuous attribute was normalized by scaling between 0 and 1. [sent-66, score-0.13]
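A minimal sketch of these two preprocessing steps; the column names are invented for illustration, and the min-max scaling assumes the attribute is not constant.

```python
# Sketch of the described metadata preprocessing: one binary dimension per
# category value, plus min-max scaling of continuous attributes.
import pandas as pd

def preprocess_metadata(meta: pd.DataFrame, categorical, continuous):
    parts = []
    for col in categorical:
        # e.g. "Is Product Type Music", "Is Product Type Electronics"
        parts.append(pd.get_dummies(meta[col], prefix=f"Is {col}").astype(float))
    for col in continuous:
        lo, hi = meta[col].min(), meta[col].max()
        span = (hi - lo) if hi > lo else 1.0
        parts.append(((meta[col] - lo) / span).to_frame(col))
    return pd.concat(parts, axis=1)

# e.g. preprocess_metadata(meta, categorical=["Product Type", "Publisher"],
#                          continuous=["Price", "Review Year"])
```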
41 The training data S, D for metric learning are generated as follows: 1. [sent-67, score-0.072]
42 Sample two equally sized groups, apply our domain difference testing method, and find the difference between these data groups. [sent-69, score-0.509]
43 Assign a distance to each pair of instances by the average distance of all partitions that partition the pair into different groups. [sent-71, score-0.32]
44 Select top n similar pairs as S and top n different pairs as D. [sent-73, score-0.052]
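A hedged sketch of this generation loop, reusing the domain_model_loss sketch above. The number of random partitions and the uniform sampling of the two groups are assumptions, since the original sampling details are truncated in the extracted text.

```python
# Sketch: average the domain difference score over random partitions, then
# pick the most similar pairs as S and the most different pairs as D.
import numpy as np

def generate_pairs(X, y, n_partitions=100, top_frac=0.10, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    n = X.shape[0]
    dist_sum = np.zeros((n, n))
    dist_cnt = np.zeros((n, n))
    for _ in range(n_partitions):
        perm = rng.permutation(n)
        g1, g2 = perm[: n // 2], perm[n // 2 : 2 * (n // 2)]
        d = domain_model_loss(X[g1], y[g1], X[g2], y[g2])  # DML sketch above
        dist_sum[np.ix_(g1, g2)] += d  # every (i in g1, j in g2) pair is split
        dist_cnt[np.ix_(g1, g2)] += 1
    dist_sum = dist_sum + dist_sum.T   # symmetrize over pair orientations
    dist_cnt = dist_cnt + dist_cnt.T
    ii, jj = np.triu_indices(n, k=1)
    seen = dist_cnt[ii, jj] > 0
    ii, jj = ii[seen], jj[seen]
    avg = dist_sum[ii, jj] / dist_cnt[ii, jj]
    order = np.argsort(avg)
    top_n = max(1, int(top_frac * len(order)))
    S = list(zip(ii[order[:top_n]], jj[order[:top_n]]))    # most similar
    D = list(zip(ii[order[-top_n:]], jj[order[-top_n:]]))  # most different
    return S, D
```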
45 The learned distance metric A now conveys the domain difference information obtained from our domain distance test results: which meta attributes are important for domain partitioning and which are not as important. [sent-74, score-1.516]
46 Following Xing et al. (2002), we transform each instance's metadata features M by MB^T, where B^T B = A. [sent-76, score-0.324]
47 Then we use a clustering method on MB^T, and the output is our domain partitioning result. [sent-77, score-0.54]
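This last step can be sketched directly: factor A = B^T B (here by Cholesky decomposition, with a small ridge since A is only guaranteed positive semi-definite), project the metadata, and run K-means; under this transform, Euclidean distance equals the learned A-distance.

```python
# Sketch of the final ADP step: map metadata M to M B^T and cluster.
import numpy as np
from sklearn.cluster import KMeans

def partition_domains(M, A, n_clusters, seed=0):
    # A = L @ L.T from Cholesky, so B = L.T satisfies B.T @ B = A.
    L = np.linalg.cholesky(A + 1e-9 * np.eye(A.shape[0]))
    B = L.T
    M_proj = M @ B.T  # now ||(x - y) @ B.T||_2 equals ||x - y||_A
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(M_proj)
    return km.labels_  # learned domain identity per instance
```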
48 3 Experiment Methodology Datasets To evaluate our methods, we used two subsets of the Amazon review corpus (Jindal and Liu, 2008), [sent-78, score-0.036]
49 which originally contains 5.8 million reviews with a variety of metadata about products and users. [sent-79, score-0.449]
50 The first subset (BOOK) contains 20,000 reviews on books published by the eleven most popular publishers, while the second (PROD) contains reviews about products within the seven most common product categories. [sent-80, score-0.342]
51 We randomly split each dataset into training and testing sets with equal size. [sent-81, score-0.034]
52 The task is to predict a positive or negative label for each review. [sent-82, score-0.026]
53 Reviews of 4 or 5 stars are considered positive and reviews of 1 or 2 stars are considered negative, while 3-star reviews are excluded. [sent-84, score-0.328]
54 Each review has multiple metadata attributes, such as the book's publisher, the product's type, the user's state location, the product price, the review year, and the number of other users' feedback. [sent-85, score-0.49]
55 It creates an augmented feature space as the Cartesian product of the input features and the original domains plus a shared domain. [sent-88, score-0.279]
56 Then it uses an SVM classifier over the augmented feature space to obtain the classification result. [sent-89, score-0.024]
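For concreteness, here is a small sketch of the standard FEDA feature augmentation (Daumé III, 2007): each instance keeps a shared copy of its features plus a copy in its own domain's block, with all other blocks zero. The dense-matrix layout is a simplification; a real implementation would typically use sparse features.

```python
# Sketch of "frustratingly easy" domain adaptation feature augmentation.
import numpy as np

def feda_augment(X, domains, n_domains):
    """X: (n, d) features; domains: length-n array of ids in [0, n_domains)."""
    n, d = X.shape
    X_aug = np.zeros((n, d * (n_domains + 1)))
    X_aug[:, :d] = X                                 # shared block
    for i, k in enumerate(domains):
        X_aug[i, d * (k + 1) : d * (k + 2)] = X[i]   # domain-specific block
    return X_aug

# An SVM (e.g. sklearn.svm.LinearSVC) is then trained on X_aug.
```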
57 CW learning is an online update method that maintains a probabilistic confidence for each parameter by keeping track of its variance. [sent-95, score-0.025]
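The exact closed-form CW update (Dredze et al., 2008) is more involved; as a hedged illustration of the per-parameter variance idea only, here is the closely related and simpler AROW update (Crammer et al., 2009) with a diagonal covariance. This is not the update used in the paper.

```python
# AROW-style confidence-weighted update: low-variance (high-confidence)
# parameters move less, and variance shrinks on features we have seen.
import numpy as np

def arow_update(mu, sigma, x, y, r=1.0):
    """mu: weight means; sigma: per-feature variances; y in {-1, +1}."""
    margin = y * (mu @ x)
    conf = (sigma * x) @ x                       # x^T Sigma x, diagonal Sigma
    if margin < 1.0:                             # hinge violation: update
        beta = 1.0 / (conf + r)
        alpha = (1.0 - margin) * beta
        mu = mu + alpha * y * sigma * x
        sigma = sigma - beta * (sigma * x) ** 2
    return mu, sigma
```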
58 Domain Partition Methods We evaluated the domain partition results provided by our ADP on the two MDL methods (FEDA & MDR). [sent-97, score-0.504]
59 For simplicity and efficiency, we use Naive Bayes as our base prediction models f1 and f2 to generate the domain model loss score described in Section 2. [sent-98, score-0.388]
60 In training data generation, we choose the top 10% most similar pairs as S and the top 10% most different pairs as D. [sent-100, score-0.052]
61 Given the learned distance metric A, we use K-means to do the clustering. [sent-101, score-0.133]
62 The number of clusters is selected by five-fold cross-validation on the training set. [sent-102, score-0.028]
63 We ran each random partition ten times and took the average; 3) MAMD, proposed by Joshi et al. [sent-108, score-0.161]
64 However, the original version of MAMD does not support continuous attributes such as price. [sent-110, score-0.13]
65 So we made an extension that sorts these values into ten bins and then treats them as categorical values. [sent-111, score-0.106]
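A tiny sketch of this binning extension. Whether the bins were equal-frequency or equal-width is not stated, so rank-based (equal-frequency) binning is an assumption here.

```python
# Sketch: sort a continuous attribute into ten bins, then treat the bin
# index as a categorical value.
import numpy as np

def to_bins(values, n_bins=10):
    ranks = np.argsort(np.argsort(values))   # rank of each value
    return (ranks * n_bins) // len(values)   # bin index in [0, n_bins)
```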
66 As we discussed for Table 1, FEDA might be less sensitive to domain partition results, which causes high baseline performance and high ADP+FEDA performance with small K. [sent-114, score-0.504]
67 Since the performance tends to increase with k up to 50 in three of the figures (1(b), 1(c), and 1(d)), we believe that the ground-truth number of domains is likely larger than 50. [sent-115, score-0.343]
68 These results clearly indicate that ADP does provide more desirable domain assignments for MDL. [sent-116, score-0.373]
69 The single attribute selected by 1-Best, such as publisher, yields only 11 domains, which limits the ability of 1-Best to completely express the domain information. [sent-117, score-0.715]
70 Our generated domains integrate multiple metadata attributes, lead to more detailed domain partitions, and enhance the ability of MDL methods to capture the differences between different groups of data. [sent-118, score-0.954]
71 Although accuracies grow with k in general, we also see fluctuations in the curves, especially when the curves are zoomed in to a small range. [sent-119, score-0.129]
72 To get smoother results, we can sample more data to calculate domain similarity and repeat the K-means clustering with more initializations. [sent-120, score-0.433]
73 5 Conclusions In this paper, we propose an Automatic Domain Partition (ADP) method that provides better domain identities for multi-domain learning methods. [sent-121, score-0.452]
74 We first propose a new approach to identify whether two data groups should be considered as different domains, by comparing the differences using Domain Model Loss. [sent-122, score-0.085]
75 We use a supervised clustering approach to train our model with labels generated by the domain difference tests, and cluster the re-weighted metadata with K-means to obtain our domain partition. [sent-123, score-1.374]
76 Experiments on real world multi-domain data show that the domain identities generated by our method can improve the performance of MDL models. [sent-124, score-0.475]
77 Distance metric learning with application to clustering with side-information. [sent-174, score-0.112]
wordName wordTfidf (topN-words)
[('mdl', 0.545), ('domain', 0.343), ('metadata', 0.324), ('feda', 0.219), ('mamd', 0.188), ('adp', 0.162), ('partition', 0.161), ('domains', 0.161), ('partitioning', 0.134), ('mdr', 0.125), ('dredze', 0.124), ('identities', 0.109), ('mj', 0.104), ('reviews', 0.1), ('dml', 0.094), ('product', 0.094), ('attribute', 0.085), ('distance', 0.084), ('prod', 0.082), ('crammer', 0.081), ('categorical', 0.081), ('partitions', 0.076), ('joshi', 0.075), ('koby', 0.071), ('clustering', 0.063), ('book', 0.063), ('carolyn', 0.063), ('mahesh', 0.063), ('mbt', 0.063), ('groups', 0.061), ('stars', 0.06), ('price', 0.06), ('cw', 0.057), ('attributes', 0.056), ('frustratingly', 0.054), ('ros', 0.054), ('electronics', 0.05), ('metric', 0.049), ('pittsburgh', 0.049), ('xing', 0.049), ('daum', 0.047), ('carnegie', 0.047), ('mellon', 0.047), ('cmu', 0.047), ('cluster', 0.047), ('mi', 0.047), ('continuous', 0.045), ('loss', 0.045), ('amazon', 0.042), ('difference', 0.042), ('blitzer', 0.042), ('jindal', 0.042), ('fernando', 0.041), ('shai', 0.04), ('calculates', 0.038), ('meta', 0.038), ('dimensions', 0.038), ('music', 0.037), ('belong', 0.036), ('review', 0.036), ('fold', 0.036), ('testing', 0.034), ('motivation', 0.033), ('partitioned', 0.032), ('assignments', 0.03), ('publishers', 0.029), ('william', 0.029), ('school', 0.029), ('finkel', 0.029), ('supervised', 0.028), ('clusters', 0.028), ('mhe', 0.027), ('aet', 0.027), ('bendavid', 0.027), ('btb', 0.027), ('ocfo', 0.027), ('smoother', 0.027), ('tributes', 0.027), ('zoomed', 0.027), ('label', 0.026), ('pairs', 0.026), ('nips', 0.026), ('curves', 0.026), ('extensions', 0.025), ('kulesza', 0.025), ('fidence', 0.025), ('publisher', 0.025), ('bins', 0.025), ('fluctuations', 0.025), ('sized', 0.025), ('accuracies', 0.025), ('products', 0.025), ('considered', 0.024), ('augmented', 0.024), ('identity', 0.024), ('cohen', 0.024), ('popular', 0.023), ('equally', 0.023), ('generated', 0.023), ('mk', 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 1.0 29 emnlp-2013-Automatic Domain Partitioning for Multi-Domain Learning
Author: Di Wang ; Chenyan Xiong ; William Yang Wang
Abstract: Models learned from one domain might not be generalizable to other domains (Ben-David et al., 2006; Ben-David et al., 2010). Multi-Domain learning (MDL) assumes that the domain labels in the dataset are known. However, when there are multiple metadata attributes available, it is not always straightforward to select a single best attribute for domain partition, and it is possible that combining more than one metadata attribute (including continuous attributes) can lead to better MDL performance. In this work, we propose an automatic domain partitioning approach that aims at providing better domain identities for MDL. We use a supervised clustering approach that learns the domain distance between data instances, and then clusters the data into better domains for MDL. Our experiment on real multi-domain datasets shows that using our automatically generated domain partition improves over popular MDL methods.
2 0.20818302 120 emnlp-2013-Learning Latent Word Representations for Domain Adaptation using Supervised Word Clustering
Author: Min Xiao ; Feipeng Zhao ; Yuhong Guo
Abstract: Domain adaptation has been popularly studied on exploiting labeled information from a source domain to learn a prediction model in a target domain. In this paper, we develop a novel representation learning approach to address domain adaptation for text classification with automatically induced discriminative latent features, which are generalizable across domains while informative to the prediction task. Specifically, we propose a hierarchical multinomial Naive Bayes model with latent variables to conduct supervised word clustering on labeled documents from both source and target domains, and then use the produced cluster distribution of each word as its latent feature representation for domain adaptation. We train this latent graphical model using a simple expectation-maximization (EM) algorithm. We empirically evaluate the proposed method with both cross-domain document categorization tasks on Reuters-21578 dataset and cross-domain sentiment classification tasks on Amazon product review dataset. The experimental results demonstrate that our proposed approach achieves superior performance compared with alternative methods.
3 0.12683611 136 emnlp-2013-Multi-Domain Adaptation for SMT Using Multi-Task Learning
Author: Lei Cui ; Xilun Chen ; Dongdong Zhang ; Shujie Liu ; Mu Li ; Ming Zhou
Abstract: Domain adaptation for SMT usually adapts models to an individual specific domain. However, it often lacks some correlation among different domains where common knowledge could be shared to improve the overall translation quality. In this paper, we propose a novel multi-domain adaptation approach for SMT using Multi-Task Learning (MTL), with in-domain models tailored for each specific domain and a general-domain model shared by different domains. The parameters of these models are tuned jointly via MTL so that they can learn general knowledge more accurately and exploit domain knowledge better. Our experiments on a large-scale English-to-Chinese translation task validate that the MTL-based adaptation approach significantly and consistently improves the translation quality compared to a non-adapted baseline. Furthermore, it also outperforms the individual adaptation of each specific domain.
4 0.11762263 169 emnlp-2013-Semi-Supervised Representation Learning for Cross-Lingual Text Classification
Author: Min Xiao ; Yuhong Guo
Abstract: Cross-lingual adaptation aims to learn a prediction model in a label-scarce target language by exploiting labeled data from a labelrich source language. An effective crosslingual adaptation system can substantially reduce the manual annotation effort required in many natural language processing tasks. In this paper, we propose a new cross-lingual adaptation approach for document classification based on learning cross-lingual discriminative distributed representations of words. Specifically, we propose to maximize the loglikelihood of the documents from both language domains under a cross-lingual logbilinear document model, while minimizing the prediction log-losses of labeled documents. We conduct extensive experiments on cross-lingual sentiment classification tasks of Amazon product reviews. Our experimental results demonstrate the efficacy of the pro- posed cross-lingual adaptation approach.
5 0.078574032 77 emnlp-2013-Exploiting Domain Knowledge in Aspect Extraction
Author: Zhiyuan Chen ; Arjun Mukherjee ; Bing Liu ; Meichun Hsu ; Malu Castellanos ; Riddhiman Ghosh
Abstract: Aspect extraction is one of the key tasks in sentiment analysis. In recent years, statistical models have been used for the task. However, such models without any domain knowledge often produce aspects that are not interpretable in applications. To tackle the issue, some knowledge-based topic models have been proposed, which allow the user to input some prior domain knowledge to generate coherent aspects. However, existing knowledge-based topic models have several major shortcomings, e.g., little work has been done to incorporate the cannot-link type of knowledge or to automatically adjust the number of topics based on domain knowledge. This paper proposes a more advanced topic model, called MC-LDA (LDA with m-set and c-set), to address these problems, which is based on an Extended generalized Pólya urn (E-GPU) model (which is also proposed in this paper). Experiments on real-life product reviews from a variety of domains show that MC-LDA outperforms the existing state-of-the-art models markedly.
6 0.067896716 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation
7 0.067387514 42 emnlp-2013-Building Specialized Bilingual Lexicons Using Large Scale Background Knowledge
8 0.065476388 92 emnlp-2013-Growing Multi-Domain Glossaries from a Few Seeds using Probabilistic Topic Models
9 0.065369084 99 emnlp-2013-Implicit Feature Detection via a Constrained Topic Model and SVM
10 0.060387172 94 emnlp-2013-Identifying Manipulated Offerings on Review Portals
11 0.052430041 202 emnlp-2013-Where Not to Eat? Improving Public Policy by Predicting Hygiene Inspections Using Online Reviews
12 0.051518548 168 emnlp-2013-Semi-Supervised Feature Transformation for Dependency Parsing
13 0.048179999 194 emnlp-2013-Unsupervised Relation Extraction with General Domain Knowledge
14 0.040050332 138 emnlp-2013-Naive Bayes Word Sense Induction
15 0.038968861 1 emnlp-2013-A Constrained Latent Variable Model for Coreference Resolution
17 0.037793893 46 emnlp-2013-Classifying Message Board Posts with an Extracted Lexicon of Patient Attributes
18 0.0369813 63 emnlp-2013-Discourse Level Explanatory Relation Extraction from Product Reviews Using First-Order Logic
19 0.035803538 10 emnlp-2013-A Multi-Teraflop Constituency Parser using GPUs
20 0.034951575 53 emnlp-2013-Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization
topicId topicWeight
[(0, -0.144), (1, 0.016), (2, -0.067), (3, -0.051), (4, 0.069), (5, 0.034), (6, 0.044), (7, 0.036), (8, -0.004), (9, -0.105), (10, 0.002), (11, -0.224), (12, -0.049), (13, -0.071), (14, 0.099), (15, 0.043), (16, -0.16), (17, 0.011), (18, -0.065), (19, -0.043), (20, 0.09), (21, -0.065), (22, 0.16), (23, -0.108), (24, 0.064), (25, 0.066), (26, 0.007), (27, -0.054), (28, -0.133), (29, -0.066), (30, -0.102), (31, -0.058), (32, 0.187), (33, -0.086), (34, 0.084), (35, 0.106), (36, 0.032), (37, -0.072), (38, 0.076), (39, 0.059), (40, -0.066), (41, 0.088), (42, 0.039), (43, -0.015), (44, 0.012), (45, 0.054), (46, -0.062), (47, -0.063), (48, -0.031), (49, 0.056)]
simIndex simValue paperId paperTitle
same-paper 1 0.97295678 29 emnlp-2013-Automatic Domain Partitioning for Multi-Domain Learning
Author: Di Wang ; Chenyan Xiong ; William Yang Wang
Abstract: Models learned from one domain might not be generalizable to other domains (Ben-David et al., 2006; Ben-David et al., 2010). Multi-Domain learning (MDL) assumes that the domain labels in the dataset are known. However, when there are multiple metadata attributes available, it is not always straightforward to select a single best attribute for domain partition, and it is possible that combining more than one metadata attribute (including continuous attributes) can lead to better MDL performance. In this work, we propose an automatic domain partitioning approach that aims at providing better domain identities for MDL. We use a supervised clustering approach that learns the domain distance between data instances, and then clusters the data into better domains for MDL. Our experiment on real multi-domain datasets shows that using our automatically generated domain partition improves over popular MDL methods.
2 0.78474557 120 emnlp-2013-Learning Latent Word Representations for Domain Adaptation using Supervised Word Clustering
Author: Min Xiao ; Feipeng Zhao ; Yuhong Guo
Abstract: Domain adaptation has been popularly studied on exploiting labeled information from a source domain to learn a prediction model in a target domain. In this paper, we develop a novel representation learning approach to address domain adaptation for text classification with automatically induced discriminative latent features, which are generalizable across domains while informative to the prediction task. Specifically, we propose a hierarchical multinomial Naive Bayes model with latent variables to conduct supervised word clustering on labeled documents from both source and target domains, and then use the produced cluster distribution of each word as its latent feature representation for domain adaptation. We train this latent graphical model using a simple expectation-maximization (EM) algorithm. We empirically evaluate the proposed method with both cross-domain document categorization tasks on Reuters-21578 dataset and cross-domain sentiment classification tasks on Amazon product review dataset. The experimental results demonstrate that our proposed approach achieves superior performance compared with alternative methods.
3 0.63933301 169 emnlp-2013-Semi-Supervised Representation Learning for Cross-Lingual Text Classification
Author: Min Xiao ; Yuhong Guo
Abstract: Cross-lingual adaptation aims to learn a prediction model in a label-scarce target language by exploiting labeled data from a label-rich source language. An effective cross-lingual adaptation system can substantially reduce the manual annotation effort required in many natural language processing tasks. In this paper, we propose a new cross-lingual adaptation approach for document classification based on learning cross-lingual discriminative distributed representations of words. Specifically, we propose to maximize the log-likelihood of the documents from both language domains under a cross-lingual log-bilinear document model, while minimizing the prediction log-losses of labeled documents. We conduct extensive experiments on cross-lingual sentiment classification tasks of Amazon product reviews. Our experimental results demonstrate the efficacy of the proposed cross-lingual adaptation approach.
4 0.60026348 136 emnlp-2013-Multi-Domain Adaptation for SMT Using Multi-Task Learning
Author: Lei Cui ; Xilun Chen ; Dongdong Zhang ; Shujie Liu ; Mu Li ; Ming Zhou
Abstract: Domain adaptation for SMT usually adapts models to an individual specific domain. However, it often lacks some correlation among different domains where common knowledge could be shared to improve the overall translation quality. In this paper, we propose a novel multi-domain adaptation approach for SMT using Multi-Task Learning (MTL), with in-domain models tailored for each specific domain and a general-domain model shared by different domains. The parameters of these models are tuned jointly via MTL so that they can learn general knowledge more accurately and exploit domain knowledge better. Our experiments on a largescale English-to-Chinese translation task validate that the MTL-based adaptation approach significantly and consistently improves the translation quality compared to a non-adapted baseline. Furthermore, it also outperforms the individual adaptation of each specific domain.
5 0.4794372 92 emnlp-2013-Growing Multi-Domain Glossaries from a Few Seeds using Probabilistic Topic Models
Author: Stefano Faralli ; Roberto Navigli
Abstract: In this paper we present a minimally-supervised approach to the multi-domain acquisition of wide-coverage glossaries. We start from a small number of hypernymy relation seeds and bootstrap glossaries from the Web for dozens of domains using Probabilistic Topic Models. Our experiments show that we are able to extract high-precision glossaries comprising thousands of terms and definitions.
7 0.33871871 138 emnlp-2013-Naive Bayes Word Sense Induction
8 0.32907724 168 emnlp-2013-Semi-Supervised Feature Transformation for Dependency Parsing
9 0.32789958 77 emnlp-2013-Exploiting Domain Knowledge in Aspect Extraction
10 0.32222593 196 emnlp-2013-Using Crowdsourcing to get Representations based on Regular Expressions
11 0.31849802 161 emnlp-2013-Rule-Based Information Extraction is Dead! Long Live Rule-Based Information Extraction Systems!
12 0.31683496 94 emnlp-2013-Identifying Manipulated Offerings on Review Portals
14 0.30545479 42 emnlp-2013-Building Specialized Bilingual Lexicons Using Large Scale Background Knowledge
15 0.29785696 202 emnlp-2013-Where Not to Eat? Improving Public Policy by Predicting Hygiene Inspections Using Online Reviews
16 0.29754788 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation
17 0.29529217 50 emnlp-2013-Combining PCFG-LA Models with Dual Decomposition: A Case Study with Function Labels and Binarization
18 0.29302484 99 emnlp-2013-Implicit Feature Detection via a Constrained Topic Model and SVM
19 0.27275649 142 emnlp-2013-Open-Domain Fine-Grained Class Extraction from Web Search Queries
20 0.26769805 64 emnlp-2013-Discriminative Improvements to Distributional Sentence Similarity
topicId topicWeight
[(3, 0.023), (18, 0.033), (22, 0.085), (30, 0.059), (35, 0.011), (50, 0.014), (51, 0.141), (66, 0.038), (71, 0.038), (75, 0.026), (77, 0.019), (90, 0.408), (96, 0.019)]
simIndex simValue paperId paperTitle
1 0.86367291 155 emnlp-2013-Question Difficulty Estimation in Community Question Answering Services
Author: Jing Liu ; Quan Wang ; Chin-Yew Lin ; Hsiao-Wuen Hon
Abstract: In this paper, we address the problem of estimating question difficulty in community question answering services. We propose a competition-based model for estimating question difficulty by leveraging pairwise comparisons between questions and users. Our experimental results show that our model significantly outperforms a PageRank-based approach. Most importantly, our analysis shows that the text of question descriptions reflects the question difficulty. This implies the possibility of predicting question difficulty from the text of question descriptions.
2 0.82906747 33 emnlp-2013-Automatic Knowledge Acquisition for Case Alternation between the Passive and Active Voices in Japanese
Author: Ryohei Sasano ; Daisuke Kawahara ; Sadao Kurohashi ; Manabu Okumura
Abstract: We present a method for automatically acquiring knowledge for case alternation between the passive and active voices in Japanese. By leveraging several linguistic constraints on alternation patterns and lexical case frames obtained from a large Web corpus, our method aligns a case frame in the passive voice to a corresponding case frame in the active voice and finds an alignment between their cases. We then apply the acquired knowledge to a case alternation task and prove its usefulness.
Author: Eva Maria Vecchi ; Roberto Zamparelli ; Marco Baroni
Abstract: In this study, we use compositional distributional semantic methods to investigate restrictions in adjective ordering. Specifically, we focus on properties distinguishing Adjective-Adjective-Noun phrases in which there is flexibility in the adjective ordering from those bound to a rigid order. We explore a number of measures extracted from the distributional representation of AAN phrases which may indicate a word order restriction. We find that we are able to distinguish the relevant classes and the correct order based primarily on the degree of modification of the adjectives. Our results offer fresh insight into the semantic properties that determine adjective ordering, building a bridge between syntax and distributional semantics.
same-paper 4 0.74784189 29 emnlp-2013-Automatic Domain Partitioning for Multi-Domain Learning
Author: Di Wang ; Chenyan Xiong ; William Yang Wang
Abstract: Models learned from one domain might not be generalizable to other domains (Ben-David et al., 2006; Ben-David et al., 2010). Multi-Domain learning (MDL) assumes that the domain labels in the dataset are known. However, when there are multiple metadata attributes available, it is not always straightforward to select a single best attribute for domain partition, and it is possible that combining more than one metadata attribute (including continuous attributes) can lead to better MDL performance. In this work, we propose an automatic domain partitioning approach that aims at providing better domain identities for MDL. We use a supervised clustering approach that learns the domain distance between data instances, and then clusters the data into better domains for MDL. Our experiment on real multi-domain datasets shows that using our automatically generated domain partition improves over popular MDL methods.
5 0.69147074 192 emnlp-2013-Unsupervised Induction of Contingent Event Pairs from Film Scenes
Author: Zhichao Hu ; Elahe Rahimtoroghi ; Larissa Munishkina ; Reid Swanson ; Marilyn A. Walker
Abstract: Human engagement in narrative is partially driven by reasoning about discourse relations between narrative events, and the expectations about what is likely to happen next that results from such reasoning. Researchers in NLP have tackled modeling such expectations from a range of perspectives, including treating it as the inference of the CONTINGENT discourse relation, or as a type of common-sense causal reasoning. Our approach is to model likelihood between events by drawing on several of these lines of previous work. We implement and evaluate different unsupervised methods for learning event pairs that are likely to be CONTINGENT on one another. We refine event pairs that we learn from a corpus of film scene descriptions utilizing web search counts, and evaluate our results by collecting human judgments of contingency. Our results indicate that the use of web search counts increases the average accuracy of our best method to 85.64% over a baseline of 50%, as compared to an average accuracy of 75.15% without web search.
6 0.44232556 48 emnlp-2013-Collective Personal Profile Summarization with Social Networks
7 0.4408237 154 emnlp-2013-Prior Disambiguation of Word Tensors for Constructing Sentence Vectors
8 0.44059122 89 emnlp-2013-Gender Inference of Twitter Users in Non-English Contexts
9 0.43824533 179 emnlp-2013-Summarizing Complex Events: a Cross-Modal Solution of Storylines Extraction and Reconstruction
10 0.43512139 46 emnlp-2013-Classifying Message Board Posts with an Extracted Lexicon of Patient Attributes
11 0.4327085 51 emnlp-2013-Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction
13 0.4313238 152 emnlp-2013-Predicting the Presence of Discourse Connectives
14 0.43004256 189 emnlp-2013-Two-Stage Method for Large-Scale Acquisition of Contradiction Pattern Pairs using Entailment
15 0.42888039 98 emnlp-2013-Image Description using Visual Dependency Representations
16 0.42231658 99 emnlp-2013-Implicit Feature Detection via a Constrained Topic Model and SVM
17 0.42154515 160 emnlp-2013-Relational Inference for Wikification
18 0.42059323 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation
19 0.4205595 168 emnlp-2013-Semi-Supervised Feature Transformation for Dependency Parsing