acl acl2012 acl2012-120 knowledge-graph by maker-knowledge-mining

120 acl-2012-Information-theoretic Multi-view Domain Adaptation


Source: pdf

Author: Pei Yang ; Wei Gao ; Qi Tan ; Kam-Fai Wong

Abstract: We use multiple views for cross-domain document classification. The main idea is to strengthen the views’ consistency for target data with source training data by identifying the correlations of domain-specific features from different domains. We present an Information-theoretic Multi-view Adaptation Model (IMAM) based on a multi-way clustering scheme, where word and link clusters can draw together seemingly unrelated domain-specific features from both sides and iteratively boost the consistency between document clusterings based on word and link views. Experiments show that IMAM significantly outperforms state-of-the-art baselines.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Information-theoretic Multi-view Domain Adaptation Pei Yang1,3, Wei Gao2, Qi Tan1, Kam-Fai Wong3 1South China University of Technology, Guangzhou, China {yangpei, tanqi}@scut. [sent-1, score-0.042]

2 Qatar Computing Research Institute, Qatar Foundation. 3The Chinese University of Hong Kong, Shatin, N.T. [sent-7, score-0.037]

3 Hong Kong. kfwong@se Abstract We use multiple views for cross-domain document classification. [sent-9, score-0.332]

4 The main idea is to strengthen the views’ consistency for target data with source training data by identifying the correlations of domain-specific features from different domains. [sent-10, score-0.326]

5 1 Introduction Domain adaptation has been shown useful for many natural language processing applications, including document classification (Sarinnapakorn and Kubat, 2007) and sentiment classification (Blitzer et al., 2007). [sent-13, score-0.39]

6 Documents can be represented by multiple independent sets of features, such as words and link structures of the documents. [sent-15, score-0.119]

7 Multi-view learning aims to improve classifiers by leveraging the redundancy and consistency among these multiple views (Blum and Mitchell, 1998; Rüping and Scheffer, 2005; Abney, 2002). [sent-16, score-0.398]

8 Existing methods were designed for data from a single domain, assuming that either view alone is sufficient to predict the target class accurately. [sent-17, score-0.242]

9 This assumption is largely violated in the setting of domain adaptation, where training and test data are drawn from different distributions. [sent-21, score-0.318]

10 We propose IMAM based on the information-theoretic co-clustering framework (Dhillon et al., 2003), which combines the two learning paradigms to transfer class information across domains in multiple transformed feature spaces. [sent-24, score-0.143]

11 IMAM exploits a multi-way-clustering-based classification scheme to simultaneously cluster documents, words and links into their respective clusters. [sent-25, score-0.09]

12 In particular, the word and link clusterings can automatically associate the correlated features from different domains. [sent-26, score-0.333]

13 Such correlations bridge the domain gap and enhance the consistency of views for clustering (i.e., classifying) the target documents. [sent-27, score-0.804]

14 Our approach is closely related to Dai et al. (2007), where they proposed co-clustering-based classification (CoCC) for adaptation learning. [sent-32, score-0.227]

15 CoCC extends information-theoretic co-clustering (Dhillon et al., 2003), where in-domain constraints were added to word clusters to provide the class structure and partial categorization knowledge. [sent-34, score-0.132]

16 Chen et al. (2011) proposed CODA for adaptation based on co-training (Blum and Mitchell, 1998), which is, however, a pseudo multi-view algorithm where the original data has only one view. [sent-41, score-0.182]

17 Zhang et al. (2011) proposed an instance-level multi-view transfer algorithm that integrates classification loss and view-consistency terms based on a large-margin framework. [sent-44, score-0.495]

18 However, the instance-based approach is generally poor since new target features lack support from the source data (Blitzer et al., 2007). [sent-45, score-0.064]

19 3 Our Model Intuitively, source-specific and target-specific features can be drawn together by mining their co-occurrence with domain-independent (common) features, which helps bridge the distribution gap. [sent-48, score-0.087]

20 Meanwhile, the view consistency on target data can be strengthened if target-specific features are appropriately bundled with source-specific features. [sent-49, score-0.395]

21 Our model leverages the complementary cooperation between different views to yield better adaptation performance. [sent-50, score-0.43]

22 1 Representation Let DS be the source training documents and DT be the unlabeled target documents. [sent-52, score-0.149]

23 Each source document ds ∈ DS is labeled with a unique class label c ∈ C. [sent-54, score-0.243]

24 Our goal is to assign each target document dt ∈ DT to an appropriate class as accurately as possible. [sent-55, score-0.285]

25 Let W be the vocabulary of the entire document collection D = DS∪DT. [sent-56, score-0.118]

26 Let L be the set of all links (hyperlinks or citations) among documents. [sent-57, score-0.045]

27 Each document d is represented by a bag-of-words set {w} and a bag-of-links set {l}. [sent-60, score-0.03]

28 Our model explores multi-way clustering that simultaneously clusters documents, words and links. [sent-61, score-0.263]

29 Let D̂, Ŵ and L̂ be the respective clusterings of documents, words and links. [sent-62, score-0.192]

30 The clustering functions are defined as CD(d) = d̂ for documents, CW(w) = ŵ for words, and CL(l) = l̂ for links, where d̂, ŵ and l̂ denote the corresponding clusters. [sent-63, score-0.264]
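
To make the setup concrete, here is a minimal Python sketch of the representation and the three clustering functions. This is our own illustration, not the authors' code; all names (doc_words, C_D, etc.) are hypothetical.

```python
from collections import Counter

# Each document is a bag of words and a bag of links (hyperlinks/citations).
doc_words = {
    "d_s1": Counter({"kernel": 2, "margin": 1}),     # a source document
    "d_t1": Counter({"kernel": 1, "retrieval": 3}),  # a target document
}
doc_links = {
    "d_s1": Counter({"l_42": 1}),
    "d_t1": Counter({"l_42": 1, "l_7": 2}),
}

# Clustering functions C_D, C_W, C_L as plain lookup tables mapping each
# document, word, and link to its cluster id (d-hat, w-hat, l-hat).
C_D = {"d_s1": "dhat_0", "d_t1": "dhat_0"}
C_W = {"kernel": "what_0", "margin": "what_0", "retrieval": "what_1"}
C_L = {"l_42": "lhat_0", "l_7": "lhat_1"}
```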

31 We extend the information-theoretic co-clustering objective (Dhillon et al., 2003) to incorporate the loss from multiple views. [sent-66, score-0.067]

32 The weight α balances the effect of word or link clusters from co-clustering. [sent-69, score-0.19]

33 Therefore, for any w ∈ ŵ, l ∈ l̂, d ∈ d̂ and c ∈ C, we can calculate a set of conditional distributions such as q(d|ŵ) and q(c|ŵ). [sent-74, score-0.072]

34 Lemma 1 (Objective functions). Equation 1 can be turned into the form of alternate minimization: (i) For document clustering, we minimize Θ = Σ_d p(d)·φD(d, d̂) + φC(Ŵ, L̂), where φC(Ŵ, L̂) is a constant and φD(d, d̂) = α · D(p(w|d) || q(w|d̂)) + (1 − α) · D(p(l|d) || q(l|d̂)). [sent-78, score-0.184]

35 (ii) For word and link clustering, we minimize Θ = α Σ_w p(w)·φW(w, ŵ) + (1 − α) Σ_l p(l)·φL(l, l̂), where φW and φL are defined analogously for any feature v (e.g., a word or a link). [sent-79, score-0.185]

36 Note that φC(Ŵ, L̂) = λ[α(I(C, W) − I(C, Ŵ)) + (1 − α)(I(C, L) − I(C, L̂))], which is constant since the word/link clusters are kept fixed during the document clustering step. [sent-84, score-0.381]
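
As a rough illustration of the document-clustering step in Lemma 1, here is a sketch under our own assumptions (not the authors' implementation): distributions are plain dicts, and q_w_dhat / q_l_dhat map each candidate cluster to its word/link distribution.

```python
import math

def kl(p, q, eps=1e-12):
    """D(p || q) for distributions stored as dicts over the same vocabulary.
    eps guards against log(0); the paper instead uses Laplacian smoothing."""
    return sum(pv * math.log((pv + eps) / (q.get(k, 0.0) + eps))
               for k, pv in p.items() if pv > 0)

def reassign_document(p_w_d, p_l_d, q_w_dhat, q_l_dhat, alpha):
    """Choose the document cluster d-hat minimizing the per-document cost
    phi_D(d, d_hat) = alpha * D(p(w|d) || q(w|d_hat))
                    + (1 - alpha) * D(p(l|d) || q(l|d_hat))."""
    def cost(dhat):
        return (alpha * kl(p_w_d, q_w_dhat[dhat])
                + (1 - alpha) * kl(p_l_d, q_l_dhat[dhat]))
    return min(q_w_dhat, key=cost)
```

Alternating this reassignment with the analogous word/link reassignment of part (ii) is what drives the monotone decrease of the objective.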

37 Lemma 1 allows us to alternately reorder either the documents or both the words and links, fixing the other, in such a way that the MI loss in Eq. 1 decreases. [sent-85, score-0.164]

38 4 Consistency of Multiple Views In this section, we present how the consistency of document clustering on target data could be enhanced among multiple views, which is the key issue of our multi-view adaptation method. [sent-87, score-0.74]

39 Meanwhile, CD combines the two views and reallocates the documents so that it remains consistent with the view-based clusterings as much as possible. [sent-90, score-0.48]

40 The more consistent the views, the better the document clustering, and in turn the better the word and link clustering, which creates a positive cycle. [sent-91, score-0.237]

41 1 Disagreement Rate of Views For any document, a consistency indicator function with respect to the two view-based clusterings can be defined as follows (t is omitted for simplicity; due to space limits, the proofs of all lemmas will be given in a long version of the paper). [sent-93, score-0.398]

42 By minimizing the disagreement rate on unlabeled data, the error rate of each view can be minimized (and so is the overall error). [sent-95, score-0.752]

43 Using the optimization based on Lemma 1, we can show empirically that the disagreement rate decreases monotonically (see Section 5). [sent-98, score-0.372]
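
A minimal sketch of the disagreement rate (disagreement_rate is a hypothetical helper; it assumes the two view-based clusterings assign comparable, class-aligned cluster ids to each target document):

```python
def disagreement_rate(C_D_W, C_D_L, target_docs):
    """eta = fraction of target documents on which the word-view and
    link-view clusterings disagree (the indicator is 0 when labels match)."""
    disagreements = sum(1 for d in target_docs if C_D_W[d] != C_D_L[d])
    return disagreements / len(target_docs)

# Toy usage: two documents agree, one does not -> eta = 1/3.
C_w = {"d1": 0, "d2": 1, "d3": 1}
C_l = {"d1": 0, "d2": 1, "d3": 0}
print(disagreement_rate(C_w, C_l, ["d1", "d2", "d3"]))  # 0.333...
```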

44 2 View Combination In practice, view-based document clusterings in Eq. [sent-100, score-0.332]

45 Eq. 2 directly optimizes view combination and produces the document clustering. [sent-103, score-0.235]

46 Suppose Ω = {FD | FD(d) = d̂ ∈ D̂} is the set of all document clustering functions. [sent-105, score-0.192]

47 For any FD ∈ Ω, we obtain the disagreement rate η(FD, CDW ∩ CDL), where CDW ∩ CDL denotes the clustering resulting from the overlap of the view-based clusterings. [sent-106, score-0.594]

48 Lemma 2. CD always minimizes the disagreement rate over all FD ∈ Ω, i.e., η(CD, CDW ∩ CDL) = min_{FD ∈ Ω} η(FD, CDW ∩ CDL). Meanwhile, η(CD, CDW ∩ CDL) = η(CDW, CDL). [sent-107, score-0.372]

49 Lemma 2 suggests that IMAM always finds the document clustering with the minimal disagreement rate to the overlap of the view-based clusterings, and that this minimal disagreement rate equals the disagreement rate between the view-based clusterings. [sent-108, score-1.207]
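
A tiny sketch of the overlap clustering CDW ∩ CDL used above (our own illustration; overlap_clustering is a hypothetical helper): each overlap cluster is simply the set of documents sharing the same pair of view-based labels.

```python
def overlap_clustering(C_D_W, C_D_L, docs):
    """CDW ∩ CDL: group documents by their (word-view, link-view)
    cluster-id pair; each distinct pair is one overlap cluster."""
    return {d: (C_D_W[d], C_D_L[d]) for d in docs}

# Toy usage: d1 and d2 fall in the same overlap cluster, d3 does not.
print(overlap_clustering({"d1": 0, "d2": 0, "d3": 1},
                         {"d1": 0, "d2": 0, "d3": 0},
                         ["d1", "d2", "d3"]))
```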

50 Table 1: View disagreement rate η and error rate ϵ that decrease with iterations and their Pearson’s correlation γ. [sent-109, score-0.567]

51 Cora (McCallum et al., 2000) is an online archive of computer science articles. [sent-117, score-0.045]

52 The documents in the archive are categorized into a hierarchical structure. [sent-118, score-0.097]

53 For each set, we chose two top categories, one as the positive class and the other as the negative. [sent-122, score-0.061]

54 For example, the dataset denoted as DA-EC consists of source domain: DA 1(+), EC 1(-); and target domain: DA 2(+), EC 2(-). [sent-125, score-0.064]

55 The classification error rate ϵ is measured as the proportion of misclassified target documents. [sent-126, score-0.304]

56 To avoid infinite values, we applied Laplacian smoothing when computing the KL-divergence. [sent-127, score-0.03]
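
A small sketch of that smoothing step (our own illustration; smoothed_dist and the parameter lam are hypothetical names):

```python
def smoothed_dist(counts, vocab, lam=1.0):
    """Add-lambda (Laplacian) smoothing: every vocabulary item gets a
    non-zero probability, so KL-divergence terms never become infinite."""
    total = sum(counts.get(v, 0) for v in vocab) + lam * len(vocab)
    return {v: (counts.get(v, 0) + lam) / total for v in vocab}

# Toy usage: 'link' never occurs in the document, yet gets positive mass.
print(smoothed_dist({"word": 3}, ["word", "link"]))  # {'word': 0.8, 'link': 0.2}
```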

57 We tuned α, λ and the number of word/link clusters by cross-validation on the training data. [sent-128, score-0.071]

58 Results and Discussions Table 1 shows the monotonic decrease of the view disagreement rate η and the error rate ϵ with the iterations; their Pearson's correlation γ is nearly perfectly positive. [sent-129, score-0.684]
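
For reference, γ here is the ordinary Pearson correlation between the per-iteration η and ϵ series; a minimal sketch with made-up numbers (not values from Table 1):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Illustrative per-iteration disagreement and error rates (invented numbers).
eta = [0.30, 0.22, 0.17, 0.15]
eps = [0.25, 0.18, 0.14, 0.13]
print(pearson(eta, eps))  # close to +1 when both decrease together
```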

59 This indicates that IMAM gradually improves adaptation by strengthening the view consistency. [sent-130, score-0.299]

60 This is achieved by the reinforcement of word and link clusterings that draw together target- and source-specific features that are originally unrelated but co-occur with the common features. [sent-131, score-0.385]

61 We compared IMAM with (1) Transductive SVM (TSVM) (Joachims, 1999) using both word and link features; (2) Co-Training (Blum and Mitchell, 1998). Table 2: Comparison of error rate with baselines. [sent-132, score-0.24]

62 (3) CoCC (Dai et al., 2007): a co-clustering-based single-view transfer learner (with text view only); and (4) MVTL-LM (Zhang et al., 2011). [sent-134, score-0.199]

63 Co-Training performed a little better than TSVM by boosting the confidence of classifiers built on the distinct views in a complementary way. [sent-137, score-0.276]

64 This is because IMAM effectively leverages distinct and complementary views. [sent-142, score-0.066]

65 Compared to CoCC, the key competency of IMAM is using the source training data to improve the view consistency on the target data. [sent-143, score-0.395]

66 This suggests that the instance-based approach is not effective when the data of different domains are drawn from different feature spaces. [sent-145, score-0.042]

67 Although MVTL-LM regulates view consistency, it cannot identify the associations between target- and source-specific features, which is the key to the success of adaptation, especially when the domain gap is large and little commonality can be found. [sent-146, score-0.465]

68 In contrast, CoCC and IMAM use multi-way clustering to find such correlations. [sent-147, score-0.09]

69 6 Conclusion We presented a novel feature-level multi-view domain adaptation approach. [sent-148, score-0.276]

70 The thrust is to incorporate distinct views of document features into the information-theoretic co-clustering framework and strengthen the consistency of views on clustering (i.e., classifying) the target documents. [sent-149, score-0.999]



similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('imam', 0.354), ('cocc', 0.339), ('cdl', 0.237), ('disagreement', 0.219), ('views', 0.214), ('clusterings', 0.214), ('clustering', 0.192), ('consistency', 0.184), ('adaptation', 0.182), ('rate', 0.153), ('cdw', 0.135), ('dt', 0.13), ('link', 0.119), ('blum', 0.119), ('fd', 0.119), ('document', 0.118), ('dhillon', 0.118), ('view', 0.117), ('dai', 0.104), ('argd', 0.102), ('cd', 0.094), ('domain', 0.094), ('lemma', 0.09), ('blitzer', 0.087), ('transfer', 0.082), ('clusters', 0.071), ('cora', 0.068), ('sarinnapakorn', 0.068), ('sham', 0.068), ('tsvm', 0.068), ('uping', 0.068), ('loss', 0.067), ('minimize', 0.066), ('ds', 0.064), ('target', 0.064), ('class', 0.061), ('sridharan', 0.059), ('kakade', 0.059), ('mitchell', 0.058), ('meanwhile', 0.056), ('abney', 0.056), ('documents', 0.052), ('unrelated', 0.052), ('qf', 0.05), ('dasgupta', 0.05), ('transductive', 0.048), ('seemingly', 0.048), ('links', 0.045), ('classification', 0.045), ('archive', 0.045), ('strengthen', 0.045), ('bridge', 0.045), ('sigkdd', 0.042), ('let', 0.042), ('functions', 0.042), ('error', 0.042), ('drawn', 0.042), ('gap', 0.042), ('mind', 0.042), ('cut', 0.042), ('daum', 0.04), ('pearson', 0.039), ('mi', 0.037), ('ew', 0.037), ('qa', 0.037), ('minimizing', 0.035), ('leverages', 0.034), ('kong', 0.034), ('xt', 0.034), ('ec', 0.033), ('correlations', 0.033), ('unlabeled', 0.033), ('da', 0.032), ('pages', 0.032), ('distinct', 0.032), ('ce', 0.031), ('commonality', 0.03), ('tahtee', 0.03), ('karthik', 0.03), ('qis', 0.03), ('strengthened', 0.03), ('ddi', 0.03), ('citations', 0.03), ('ionfg', 0.03), ('infinity', 0.03), ('mentary', 0.03), ('csa', 0.03), ('laplacian', 0.03), ('biographies', 0.03), ('blenders', 0.03), ('bollywood', 0.03), ('dre', 0.03), ('aac', 0.03), ('dharmendra', 0.03), ('inderjit', 0.03), ('aistats', 0.03), ('cdo', 0.03), ('competency', 0.03), ('convex', 0.03), ('isd', 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 120 acl-2012-Information-theoretic Multi-view Domain Adaptation

Author: Pei Yang ; Wei Gao ; Qi Tan ; Kam-Fai Wong

Abstract: We use multiple views for cross-domain document classification. The main idea is to strengthen the views’ consistency for target data with source training data by identifying the correlations of domain-specific features from different domains. We present an Information-theoretic Multi-view Adaptation Model (IMAM) based on a multi-way clustering scheme, where word and link clusters can draw together seemingly unrelated domain-specific features from both sides and iteratively boost the consistency between document clusterings based on word and link views. Experiments show that IMAM significantly outperforms state-of-the-art baselines.

2 0.095107041 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons

Author: Fangtao Li ; Sinno Jialin Pan ; Ou Jin ; Qiang Yang ; Xiaoyan Zhu

Abstract: Extracting sentiment and topic lexicons is important for opinion mining. Previous works have showed that supervised learning methods are superior for this task. However, the performance of supervised methods highly relies on manually labeled training data. In this paper, we propose a domain adaptation framework for sentiment- and topic- lexicon co-extraction in a domain of interest where we do not require any labeled data, but have lots of labeled data in another related domain. The framework is twofold. In the first step, we generate a few high-confidence sentiment and topic seeds in the target domain. In the second step, we propose a novel Relational Adaptive bootstraPping (RAP) algorithm to expand the seeds in the target domain by exploiting the labeled source domain data and the relationships between topic and sentiment words. Experimental results show that our domain adaptation framework can extract precise lexicons in the target domain without any annotation.

3 0.086073719 103 acl-2012-Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation

Author: Kenji Imamura ; Kuniko Saito ; Kugatsu Sadamitsu ; Hitoshi Nishikawa

Abstract: This paper presents grammar error correction for Japanese particles that uses discriminative sequence conversion, which corrects erroneous particles by substitution, insertion, and deletion. The error correction task is hindered by the difficulty of collecting large error corpora. We tackle this problem by using pseudoerror sentences generated automatically. Furthermore, we apply domain adaptation, the pseudo-error sentences are from the source domain, and the real-error sentences are from the target domain. Experiments show that stable improvement is achieved by using domain adaptation.

4 0.082952932 45 acl-2012-Capturing Paradigmatic and Syntagmatic Lexical Relations: Towards Accurate Chinese Part-of-Speech Tagging

Author: Weiwei Sun ; Hans Uszkoreit

Abstract: From the perspective of structural linguistics, we explore paradigmatic and syntagmatic lexical relations for Chinese POS tagging, an important and challenging task for Chinese language processing. Paradigmatic lexical relations are explicitly captured by word clustering on large-scale unlabeled data and are used to design new features to enhance a discriminative tagger. Syntagmatic lexical relations are implicitly captured by constituent parsing and are utilized via system combination. Experiments on the Penn Chinese Treebank demonstrate the importance of both paradigmatic and syntagmatic relations. Our linguistically motivated approaches yield a relative error reduction of 18% in total over a stateof-the-art baseline.

5 0.081565164 143 acl-2012-Mixing Multiple Translation Models in Statistical Machine Translation

Author: Majid Razmara ; George Foster ; Baskaran Sankaran ; Anoop Sarkar

Abstract: Statistical machine translation is often faced with the problem of combining training data from many diverse sources into a single translation model which then has to translate sentences in a new domain. We propose a novel approach, ensemble decoding, which combines a number of translation systems dynamically at the decoding step. In this paper, we evaluate performance on a domain adaptation setting where we translate sentences from the medical domain. Our experimental results show that ensemble decoding outperforms various strong baselines including mixture models, the current state-of-the-art for domain adaptation in machine translation.

6 0.080144584 208 acl-2012-Unsupervised Relation Discovery with Sense Disambiguation

7 0.078368574 48 acl-2012-Classifying French Verbs Using French and English Lexical Resources

8 0.073174767 62 acl-2012-Cross-Lingual Mixture Model for Sentiment Classification

9 0.070748568 9 acl-2012-A Cost Sensitive Part-of-Speech Tagging: Differentiating Serious Errors from Minor Errors

10 0.070253156 199 acl-2012-Topic Models for Dynamic Translation Model Adaptation

11 0.067491084 144 acl-2012-Modeling Review Comments

12 0.062041715 64 acl-2012-Crosslingual Induction of Semantic Roles

13 0.055494737 85 acl-2012-Event Linking: Grounding Event Reference in a News Archive

14 0.054786913 17 acl-2012-A Novel Burst-based Text Representation Model for Scalable Event Detection

15 0.054425903 83 acl-2012-Error Mining on Dependency Trees

16 0.053752799 117 acl-2012-Improving Word Representations via Global Context and Multiple Word Prototypes

17 0.053279825 203 acl-2012-Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information

18 0.052090339 161 acl-2012-Polarity Consistency Checking for Sentiment Dictionaries

19 0.051156037 42 acl-2012-Bootstrapping via Graph Propagation

20 0.049783755 94 acl-2012-Fast Online Training with Frequency-Adaptive Learning Rates for Chinese Word Segmentation and New Word Detection


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.162), (1, 0.067), (2, 0.025), (3, -0.022), (4, -0.003), (5, 0.021), (6, -0.01), (7, -0.023), (8, -0.012), (9, -0.06), (10, 0.007), (11, -0.046), (12, 0.004), (13, -0.005), (14, -0.048), (15, 0.062), (16, 0.095), (17, 0.007), (18, 0.011), (19, -0.012), (20, -0.05), (21, -0.024), (22, 0.004), (23, 0.018), (24, 0.065), (25, -0.099), (26, -0.032), (27, -0.026), (28, 0.045), (29, 0.084), (30, 0.027), (31, 0.087), (32, -0.056), (33, 0.005), (34, -0.141), (35, -0.108), (36, 0.079), (37, -0.087), (38, -0.088), (39, -0.105), (40, 0.026), (41, -0.055), (42, -0.059), (43, -0.009), (44, -0.012), (45, 0.067), (46, -0.072), (47, 0.007), (48, -0.006), (49, -0.233)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94762045 120 acl-2012-Information-theoretic Multi-view Domain Adaptation

Author: Pei Yang ; Wei Gao ; Qi Tan ; Kam-Fai Wong

Abstract: We use multiple views for cross-domain document classification. The main idea is to strengthen the views’ consistency for target data with source training data by identifying the correlations of domain-specific features from different domains. We present an Information-theoretic Multi-view Adaptation Model (IMAM) based on a multi-way clustering scheme, where word and link clusters can draw together seemingly unrelated domain-specific features from both sides and iteratively boost the consistency between document clusterings based on word and link views. Experiments show that IMAM significantly outperforms state-of-the-art baselines.

2 0.45813847 103 acl-2012-Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation

Author: Kenji Imamura ; Kuniko Saito ; Kugatsu Sadamitsu ; Hitoshi Nishikawa

Abstract: This paper presents grammar error correction for Japanese particles that uses discriminative sequence conversion, which corrects erroneous particles by substitution, insertion, and deletion. The error correction task is hindered by the difficulty of collecting large error corpora. We tackle this problem by using pseudoerror sentences generated automatically. Furthermore, we apply domain adaptation, the pseudo-error sentences are from the source domain, and the real-error sentences are from the target domain. Experiments show that stable improvement is achieved by using domain adaptation.

3 0.44411349 62 acl-2012-Cross-Lingual Mixture Model for Sentiment Classification

Author: Xinfan Meng ; Furu Wei ; Xiaohua Liu ; Ming Zhou ; Ge Xu ; Houfeng Wang

Abstract: The amount of labeled sentiment data in English is much larger than that in other languages. Such a disproportion arouse interest in cross-lingual sentiment classification, which aims to conduct sentiment classification in the target language (e.g. Chinese) using labeled data in the source language (e.g. English). Most existing work relies on machine translation engines to directly adapt labeled data from the source language to the target language. This approach suffers from the limited coverage of vocabulary in the machine translation results. In this paper, we propose a generative cross-lingual mixture model (CLMM) to leverage unlabeled bilingual parallel data. By fitting parameters to maximize the likelihood of the bilingual parallel data, the proposed model learns previously unseen sentiment words from the large bilingual parallel data and improves vocabulary coverage signifi- cantly. Experiments on multiple data sets show that CLMM is consistently effective in two settings: (1) labeled data in the target language are unavailable; and (2) labeled data in the target language are also available.

4 0.43737924 112 acl-2012-Humor as Circuits in Semantic Networks

Author: Igor Labutov ; Hod Lipson

Abstract: This work presents a first step to a general implementation of the Semantic-Script Theory of Humor (SSTH). Of the scarce amount of research in computational humor, no research had focused on humor generation beyond simple puns and punning riddles. We propose an algorithm for mining simple humorous scripts from a semantic network (ConceptNet) by specifically searching for dual scripts that jointly maximize overlap and incongruity metrics in line with Raskin’s Semantic-Script Theory of Humor. Initial results show that a more relaxed constraint of this form is capable of generating humor of deeper semantic content than wordplay riddles. We evaluate the said metrics through a user-assessed quality of the generated two-liners.

5 0.43432558 143 acl-2012-Mixing Multiple Translation Models in Statistical Machine Translation

Author: Majid Razmara ; George Foster ; Baskaran Sankaran ; Anoop Sarkar

Abstract: Statistical machine translation is often faced with the problem of combining training data from many diverse sources into a single translation model which then has to translate sentences in a new domain. We propose a novel approach, ensemble decoding, which combines a number of translation systems dynamically at the decoding step. In this paper, we evaluate performance on a domain adaptation setting where we translate sentences from the medical domain. Our experimental results show that ensemble decoding outperforms various strong baselines including mixture models, the current state-of-the-art for domain adaptation in machine translation.

6 0.41774589 48 acl-2012-Classifying French Verbs Using French and English Lexical Resources

7 0.41008297 42 acl-2012-Bootstrapping via Graph Propagation

8 0.39923272 14 acl-2012-A Joint Model for Discovery of Aspects in Utterances

9 0.39699939 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons

10 0.39256865 181 acl-2012-Spectral Learning of Latent-Variable PCFGs

11 0.39185593 129 acl-2012-Learning High-Level Planning from Text

12 0.38798755 117 acl-2012-Improving Word Representations via Global Context and Multiple Word Prototypes

13 0.36929452 37 acl-2012-Baselines and Bigrams: Simple, Good Sentiment and Topic Classification

14 0.36297655 45 acl-2012-Capturing Paradigmatic and Syntagmatic Lexical Relations: Towards Accurate Chinese Part-of-Speech Tagging

15 0.35959727 163 acl-2012-Prediction of Learning Curves in Machine Translation

16 0.35885748 130 acl-2012-Learning Syntactic Verb Frames using Graphical Models

17 0.35328031 83 acl-2012-Error Mining on Dependency Trees

18 0.35044724 74 acl-2012-Discriminative Pronunciation Modeling: A Large-Margin, Feature-Rich Approach

19 0.34880531 203 acl-2012-Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information

20 0.34546107 15 acl-2012-A Meta Learning Approach to Grammatical Error Correction


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(25, 0.019), (26, 0.03), (27, 0.314), (28, 0.055), (30, 0.026), (37, 0.056), (39, 0.044), (57, 0.035), (74, 0.022), (82, 0.015), (84, 0.024), (85, 0.019), (90, 0.121), (92, 0.079), (94, 0.025), (99, 0.04)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.74936855 120 acl-2012-Information-theoretic Multi-view Domain Adaptation

Author: Pei Yang ; Wei Gao ; Qi Tan ; Kam-Fai Wong

Abstract: We use multiple views for cross-domain document classification. The main idea is to strengthen the views’ consistency for target data with source training data by identifying the correlations of domain-specific features from different domains. We present an Information-theoretic Multi-view Adaptation Model (IMAM) based on a multi-way clustering scheme, where word and link clusters can draw together seemingly unrelated domain-specific features from both sides and iteratively boost the consistency between document clusterings based on word and link views. Experiments show that IMAM significantly outperforms state-of-the-art baselines.

2 0.48464024 110 acl-2012-Historical Analysis of Legal Opinions with a Sparse Mixed-Effects Latent Variable Model

Author: William Yang Wang ; Elijah Mayfield ; Suresh Naidu ; Jeremiah Dittmar

Abstract: We propose a latent variable model to enhance historical analysis of large corpora. This work extends prior work in topic modelling by incorporating metadata, and the interactions between the components in metadata, in a general way. To test this, we collect a corpus of slavery-related United States property law judgements sampled from the years 1730 to 1866. We study the language use in these legal cases, with a special focus on shifts in opinions on controversial topics across different regions. Because this is a longitudinal data set, we are also interested in understanding how these opinions change over the course of decades. We show that the joint learning scheme of our sparse mixed-effects model improves on other state-of-the-art generative and discriminative models on the region and time period identification tasks. Experiments show that our sparse mixed-effects model is more accurate quantitatively and qualitatively interesting, and that these improvements are robust across different parameter settings.

3 0.47803828 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons

Author: Fangtao Li ; Sinno Jialin Pan ; Ou Jin ; Qiang Yang ; Xiaoyan Zhu

Abstract: Extracting sentiment and topic lexicons is important for opinion mining. Previous works have showed that supervised learning methods are superior for this task. However, the performance of supervised methods highly relies on manually labeled training data. In this paper, we propose a domain adaptation framework for sentiment- and topic- lexicon co-extraction in a domain of interest where we do not require any labeled data, but have lots of labeled data in another related domain. The framework is twofold. In the first step, we generate a few high-confidence sentiment and topic seeds in the target domain. In the second step, we propose a novel Relational Adaptive bootstraPping (RAP) algorithm to expand the seeds in the target domain by exploiting the labeled source domain data and the relationships between topic and sentiment words. Experimental results show that our domain adaptation framework can extract precise lexicons in the target domain without any annotation.

4 0.47754237 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers

Author: Bevan Jones ; Mark Johnson ; Sharon Goldwater

Abstract: Many semantic parsing models use tree transformations to map between natural language and meaning representation. However, while tree transformations are central to several state-of-the-art approaches, little use has been made of the rich literature on tree automata. This paper makes the connection concrete with a tree transducer based semantic parsing model and suggests that other models can be interpreted in a similar framework, increasing the generality of their contributions. In particular, this paper further introduces a variational Bayesian inference algorithm that is applicable to a wide class of tree transducers, producing state-of-the-art semantic parsing results while remaining applicable to any domain employing probabilistic tree transducers.

5 0.47701812 22 acl-2012-A Topic Similarity Model for Hierarchical Phrase-based Translation

Author: Xinyan Xiao ; Deyi Xiong ; Min Zhang ; Qun Liu ; Shouxun Lin

Abstract: Previous work using topic model for statistical machine translation (SMT) explore topic information at the word level. However, SMT has been advanced from word-based paradigm to phrase/rule-based paradigm. We therefore propose a topic similarity model to exploit topic information at the synchronous rule level for hierarchical phrase-based translation. We associate each synchronous rule with a topic distribution, and select desirable rules according to the similarity of their topic distributions with given documents. We show that our model significantly improves the translation performance over the baseline on NIST Chinese-to-English translation experiments. Our model also achieves a better performance and a faster speed than previous approaches that work at the word level.

6 0.47671738 146 acl-2012-Modeling Topic Dependencies in Hierarchical Text Categorization

7 0.47659996 31 acl-2012-Authorship Attribution with Author-aware Topic Models

8 0.47415927 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures

9 0.472707 38 acl-2012-Bayesian Symbol-Refined Tree Substitution Grammars for Syntactic Parsing

10 0.47208002 167 acl-2012-QuickView: NLP-based Tweet Search

11 0.47026914 63 acl-2012-Cross-lingual Parse Disambiguation based on Semantic Correspondence

12 0.46913171 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning

13 0.46718746 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale

14 0.46713582 217 acl-2012-Word Sense Disambiguation Improves Information Retrieval

15 0.4662962 84 acl-2012-Estimating Compact Yet Rich Tree Insertion Grammars

16 0.46629533 140 acl-2012-Machine Translation without Words through Substring Alignment

17 0.46618277 132 acl-2012-Learning the Latent Semantics of a Concept from its Definition

18 0.46590379 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information

19 0.46533698 127 acl-2012-Large-Scale Syntactic Language Modeling with Treelets

20 0.4651635 83 acl-2012-Error Mining on Dependency Trees