emnlp emnlp2012 emnlp2012-90 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Lan Du ; Wray Buntine ; Huidong Jin
Abstract: Topic models are increasingly being used for text analysis tasks, often times replacing earlier semantic techniques such as latent semantic analysis. In this paper, we develop a novel adaptive topic model with the ability to adapt topics from both the previous segment and the parent document. For this proposed model, a Gibbs sampler is developed for doing posterior inference. Experimental results show that with topic adaptation, our model significantly improves over existing approaches in terms of perplexity, and is able to uncover clear sequential structure on, for example, Herman Melville’s book “Moby Dick”.
Reference: text
sentIndex sentText sentNum sentScore
1 In this paper, we develop a novel adaptive topic model with the ability to adapt topics from both the previous segment and the parent document. [sent-11, score-0.667]
2 Experimental results show that with topic adaptation, our model significantly improves over existing approaches in terms of perplexity, and is able to uncover clear sequential structure on, for example, Herman Melville’s book “Moby Dick”. [sent-13, score-0.577]
3 1 Introduction Natural language text usually consists of topically structured and coherent components, such as groups of sentences that form paragraphs and groups of paragraphs that form sections. [sent-14, score-0.342]
4 Capturing this structural topical dependency should lead to improved topic modelling. [sent-16, score-0.43]
5 It also seems reasonable to propose that text analysis tasks that involve the structure of a document, for instance, summarisation and segmentation, should also be improved by topic models that better model that structure. [sent-17, score-0.384]
6 Recently, topic models are increasingly being used for text analysis tasks such as summarisation. (*This work was partially done while Du was at the College of Engineering & Computer Science, the Australian National University, working together with Buntine and Jin.) [sent-18, score-0.355]
7 , 2010), modelling the structural aspects of documents, for instance modelling a document as a set of segments (Du et al. [sent-25, score-0.347]
8 In this paper, we are interested in developing a new topic model which can take into account the structural topic dependency by following the higher level document subject structure, but we hope to retain the general flavour of topic models, where components (e. [sent-32, score-1.181]
9 , 2011)), in which semantically related units are clustered together to form semantically structural segments, we treat documents as sequences of segments (e. [sent-42, score-0.138]
10 In this way, we can model the topic correlation be- […]. [sent-45, score-0.355]
11 Indeed, we were impressed by the improvement in perplexity obtained by the segmented topic model (STM) (Du et al. [sent-51, score-0.486]
12 , 2010), so we considered the problem of whether one can add sequence information into a structured topic model as well. [sent-52, score-0.355]
13 A strictly sequential model would seem unrealistic for some documents, for instance books. [sent-55, score-0.196]
14 A topic model using the strictly sequential model was developed (Du et al. [sent-56, score-0.551]
15 In this paper, we develop an adaptive topic model to go beyond a strictly sequential model while allowing some hierarchical influence. [sent-58, score-0.652]
16 These models operate at a finer level than the segment (paragraph or section) level we are considering. [sent-64, score-0.207]
17 To make a tool like the HMM work at higher levels, one needs to make stronger assumptions, for instance assigning each sentence a single topic so that topic-specific word models can be used: the hidden topic Markov model (Gruber et al. [sent-65, score-1.065]
18 , 2007) that models the transitional topic structure; a global model based on the generalised Mallows model (Chen et al. [sent-66, score-0.355]
19 Researchers have also considered time series of topics: various kinds of dynamic topic models, following the early work of Blei and Lafferty (2006), represent a collection as a sequence of subcollections in epochs. [sent-68, score-0.355]
20 Here, one is modelling the collections over broad epochs, not the structure of a single document, which is what our model considers. [sent-69, score-0.159]
21 2 Background The basic topic model is first presented in Section 2. [sent-74, score-0.355]
22 In seeking to develop a general sequential topic model, we hope to go beyond a strictly sequential model and allow some hierarchical influence. [sent-76, score-0.79]
23 Hierarchical inference (and thus sequential inference) over probability vectors can be handled using the theory of hierarchical Poisson-Dirichlet processes (PDPs). [sent-78, score-0.239]
24 1 The LDA model The benchmark model for topic modelling is latent Dirichlet allocation (LDA) (Blei et al. [sent-82, score-0.476]
25 The latent variables are µi (the topic distribution for a document) and z~ (the topic assignments for observed words w~), and the model parameters are the φ~k's (word distributions). [sent-85, score-0.438]
26 , 2003)) and the Dirichlet prior α on topic distributions. [sent-94, score-0.355]
27 , nK) where nk is the number of data with value k and Σk nk = N. [sent-104, score-0.23]
28 Commonly in topic modelling, the Dirichlet distribution is used for discrete probability vectors. [sent-109, score-0.393]
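Since this collapsed Dirichlet-multinomial machinery is used repeatedly below (and underlies the marginalised likelihood of Equation (4)), it may help to recall the standard identity; this is the generic textbook form rather than the paper's exact equation, with α the Dirichlet prior and nk the counts defined above:

```latex
% Probability of a particular sequence of K-valued assignments with counts n_k,
% after integrating the probability vector mu out against a Dirichlet(alpha) prior.
\[
p(z_1,\dots,z_N \mid \alpha)
  = \int \Bigl(\prod_{k=1}^{K} \mu_k^{n_k}\Bigr)\,
    \mathrm{Dirichlet}(\mu \mid \alpha)\, d\mu
  = \frac{\Gamma\!\bigl(\sum_k \alpha_k\bigr)}{\Gamma\!\bigl(\sum_k \alpha_k + N\bigr)}
    \prod_{k=1}^{K} \frac{\Gamma(\alpha_k + n_k)}{\Gamma(\alpha_k)} .
\]
```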
29 The auxiliary variable is the table count, which is a tk for each nk (based on the Chinese Restaurant analogy of Teh et al. [sent-120, score-0.299]
30 ), and it represents the number of “tables” over which the nk “customers” are spread out. [sent-122, score-0.142]
31 Thus the following constraints hold: 0 ≤ tk ≤ nk and tk = 0 iff nk =0. [sent-123, score-0.506]
32 , 2011), where another auxiliary variable is introduced, a so-called table indicator, that for each datum zi indicates whether it is the “head of its table” (recall the nk data are spread over tk tables, each table has one and only one “head”). [sent-147, score-0.39]
33 According to this “table” logic, the number of tables for nk must be the number of data zi that are also head of table, so tk = Σ_{i=1..N} 1[zi = k] 1[ri = 1]. [sent-149, score-0.368]
34 Moreover, given this definition, the first constraint of Equation (2) on tk is automatically satisfied. [sent-150, score-0.138]
35 Finally, with tk tables there must be exactly tk heads of table, and we are indifferent about which data are the heads, thus p(…). [sent-151, score-0.327]
36 When using this marginalised likelihood (Equation 4) in a Gibbs sampler, the zi themselves are usually latent and so are also sampled; we develop a blocked Gibbs sampler for (zi, ri). [sent-162, score-0.3]
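A minimal sketch of the table-count bookkeeping just described, assuming the simple one-level case: the counts tk are recovered from the head-of-table indicators ri as tk = Σi 1[zi = k] 1[ri = 1], and indicators can in turn be reconstructed uniformly from given counts. The function names are illustrative only, not the paper's code:

```python
import random
from collections import Counter

def table_counts(z, r):
    """Reconstruct table counts t_k from topic assignments z and
    head-of-table indicators r (r[i] == 1 iff datum i heads its table)."""
    t = Counter()
    for zi, ri in zip(z, r):
        if ri == 1:
            t[zi] += 1
    return t

def sample_indicators(z, t, rng=random):
    """Given table counts t_k, sample indicators r uniformly over the
    possibilities: for each topic k, choose t_k of its n_k data as heads."""
    r = [0] * len(z)
    for k, tk in t.items():
        members = [i for i, zi in enumerate(z) if zi == k]
        for i in rng.sample(members, tk):
            r[i] = 1
    return r

# Tiny usage example: the constraints 0 <= t_k <= n_k and t_k = 0 iff n_k = 0
# of Equation (2) hold by construction.
z = [0, 0, 1, 2, 2, 2]
r = [1, 0, 1, 1, 0, 1]
t = table_counts(z, r)            # Counter({2: 2, 0: 1, 1: 1})
n = Counter(z)
assert all(0 <= t[k] <= n[k] for k in n)
```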
37 3 The Proposed Adaptive Topic Model In this section an adaptive topic model (AdaTM), a fully structured topic model, is developed by using a PDP to simultaneously model the hierarchical and the sequential topic structures. [sent-164, score-1.362]
38 In AdaTM, the two topic structures are captured by drawing topic distributions from the PDPs with two base distributions as follows. [sent-168, score-0.77]
39 (Notation: Φ, the word probability vectors as a K × W matrix; φ~k, the word probability vector for topic k; a W-dimensional prior over each word distribution; wi,j,l, the word in document i, segment j, position l; zi,j,l, the topic for word wi,j,l.) Figure 2: The adaptive topic model: µ is the document topic distribution, ν1, ν2, . [sent-170, score-1.799]
40 , νJ are the segment topic distributions, and ρ is a set of the mixture weights. [sent-173, score-0.51]
41 The document topic distribution µi and the jth segment topic distribution νi,j are linearly combined to give a base distribution for the (j + 1)th segment's topic distribution νi,j+1. [sent-174, score-0.431]
42 Call this generative process topic adaptation. [sent-178, score-0.355]
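A small sketch of the adaptation step as read from the text and Figure 2: the document distribution µi and the previous segment's distribution νi,j are mixed to form the base distribution of the PDP from which νi,j+1 is drawn. The scalar rho below stands in for the paper's set of mixture weights ρ, (a, b) are assumed to be the PDP discount and concentration hyper-parameters, and the truncated stick-breaking draw is a generic approximation, so this is illustrative rather than the paper's exact generative process:

```python
import numpy as np

def draw_pdp(base, a, b, truncation=200, rng=None):
    """Approximate draw from a Poisson-Dirichlet process PDP(a, b, base)
    over a finite topic set, via truncated stick-breaking."""
    rng = rng or np.random.default_rng()
    K = len(base)
    weights = np.zeros(K)
    remaining = 1.0
    for t in range(truncation):
        v = rng.beta(1.0 - a, b + (t + 1) * a)   # stick-breaking proportions
        atom = rng.choice(K, p=base)             # atom drawn from the base distribution
        weights[atom] += remaining * v
        remaining *= (1.0 - v)
    weights[rng.choice(K, p=base)] += remaining  # fold leftover mass into one atom
    return weights / weights.sum()

def next_segment_topics(mu, nu_j, rho, a=0.2, b=10.0, rng=None):
    """Topic adaptation sketch: mix the document distribution mu with the
    previous segment distribution nu_j to obtain the base for nu_{j+1}.
    The defaults for a and b are arbitrary illustrative values."""
    base = rho * mu + (1.0 - rho) * nu_j
    return draw_pdp(base, a, b, rng=rng)
```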
43 The graphical representation of AdaTM is shown in Figure 2, and clearly shows the combination of sequence and hierarchy for the topic probabilities. [sent-179, score-0.355]
44 (Notation, continued: Mk,w = Σi Mi,k,w, a vector of W values; ni,j,k, the topic count in document i, segment j, for topic k; Ni,j = Σ_{k=1..K} ni,j,k, the topic total in document i, segment j, i. [sent-193, score-1.109]
45 ti,j,k: table count in the CRP for document i and paragraph j, for topic k, that is inherited back to paragraph j − 1 and νi,j−1. [sent-195, score-0.595]
46 si,j,k: table count in the CRP for document i and paragraph j, for topic k, that is inherited back to the document and µi. [sent-196, score-0.634]
47 ti,j and si,j: table count vectors of the ti,j,k's and si,j,k's for document i, segment j. [sent-199, score-0.199]
48 This section proposes a blocked Gibbs sampling algorithm based on methods from Chen et al. [sent-203, score-0.126]
49 , customers, dishes and restaurants, correspond to words, topics and segments respectively. [sent-209, score-0.178]
50 The table indicators, when known, can be used to reconstruct the table counts ti,j,k and si,j,k; the indicators themselves are reconstructed by sampling. [sent-222, score-0.151]
51 To complete a formulation suitable for Gibbs sampling, we first compute the marginal distribution of the observations w~ 1:I,1:J (words), the topic assignments z1:I,1:J and the table indicators u1:I,1:J. [sent-224, score-0.452]
52 The Dirichlet integral is used to integrate out the document topic distributions µ1:I and the topic-by-words matrix Φ, and the joint posterior distribution computed for a PDP is used to recursively marginalise out the segment topic distributions ν1:I,1:J. [sent-225, score-1.081]
53 For convenience of the formulas, set ti,Ji+1,k = 0 (there is no Ji + 1 segment) and ti,1,k = 0 (the first segment only uses µi). [sent-236, score-0.155]
54 If this word is in topic k at document i and segment j, then it contributes a count to ni,j,k. It also indicates if it contributes a new table, i.e. a count to ti,j,k, for the PDP at this node. [sent-238, score-0.902]
55 If it contributes to ti,j,k, then it recurses up to contribute a data count to the PDP for document i, segment j − 1. [sent-240, score-0.254]
56 Consequently, the table indicator ui,j,l for word wi,j,l must specify whether it contributes a table to all PDP nodes reachable by it in the graph. [sent-242, score-0.129]
57 We define ui,j,l specifically as ui,j,l = (u1, u2) such that u1 ∈ [−1, 0, 1] and u2 ∈ [1, · · · , j], where u2 indicates the segment, denoted by node νu2, up to which wi,j,l contributes a table. [sent-243, score-0.248]
58 Now, we are ready to compute the conditional probabilities for jointly sampling topics and table indicators from the model posterior of Equation (5). [sent-246, score-0.251]
59 Since the posterior of Equation (5) does not explicitly mention the ui,j,l’s, they occur indirectly through the table counts, and we can randomly reconstruct them by sampling them uniformly from the space of possibilities. [sent-249, score-0.127]
60 If after sampling u1 = −1, the data contributes a table count up to µi and so ui,j,l = (u1, u2) = (−1, j). [sent-257, score-0.195]
61 Otherwise, the data contributes a table count up to the parent PDP for νi,j−1 and we recurse, repeating the sampling process at the parent node. [sent-259, score-0.195]
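The recursion just described can be sketched as control flow: a word deposits a data count at its segment's PDP node; with some probability it also opens a new table there, which is inherited either by the document node µi (stop) or by the previous segment's node νi,j−1 (recurse). The probabilities p_new_table and p_to_document below are placeholders for the actual PDP conditionals of Equation (5), so this is a hypothetical skeleton rather than the real sampler:

```python
def propagate_table(topic, j, n, t, s, p_new_table, p_to_document, rng):
    """Recursive table bookkeeping for one word of topic `topic` in segment j.

    n[j][k]: data counts at segment node j;   t[j][k]: tables inherited back
    to segment j-1;   s[j][k]: tables inherited back to the document node mu.
    p_new_table(j, k) and p_to_document(j, k) are stand-ins for the PDP
    conditional probabilities; they are placeholders, not the paper's formulas."""
    level = j
    while True:
        n[level][topic] += 1                        # data count at this node
        if rng.random() >= p_new_table(level, topic):
            return (0, level)                       # no new table opened here
        if level == 1 or rng.random() < p_to_document(level, topic):
            s[level][topic] += 1                    # table inherited by mu_i
            return (-1, level)
        t[level][topic] += 1                        # table inherited by nu_{level-1}
        level -= 1                                  # recurse at the parent node
```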
62 Estimates: learnt values of µi, νi,j, and φ~k are needed for evaluation, perplexity calculations, etc. [sent-264, score-0.131]
63 6 Experiments In the experimental work, we have three objectives: (1) to explore the setting of hyper-parameters, (2) to compare the model with the earlier sequential LDA (SeqLDA) of (Du et al. [sent-266, score-0.196]
64 All the patents in these five datasets are split into paragraphs that are taken as segments, and the sequence of paragraphs in each patent is preserved in order to maintain the original layout. [sent-274, score-0.516]
65 They are split into chapters and/or paragraphs which are treated as segments, and only stop-words are removed. [sent-279, score-0.392]
66 When reporting test perplexities, the held-out perplexity measure (Rosen-Zvi et al. [sent-283, score-0.131]
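For reference, held-out perplexity is, in the usual formulation, the exponentiated negative average log-likelihood of the held-out words; a minimal sketch, with log_p_word standing in for whatever per-word probability estimate the held-out method of Rosen-Zvi et al. supplies:

```python
import math

def perplexity(held_out_docs, log_p_word):
    """Held-out perplexity: exp(- total log-likelihood / number of tokens).
    held_out_docs is an iterable of token lists; log_p_word(doc, w) returns
    the model's log-probability of word w in the context of doc."""
    log_lik, n_tokens = 0.0, 0
    for doc in held_out_docs:
        for w in doc:
            log_lik += log_p_word(doc, w)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)
```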
67 (a) shows how perplexity changes with b; (b) shows how it changes with a. [sent-292, score-0.131]
68 (a) how perplexity changes with λS; (b) how it changes with λT. [sent-294, score-0.131]
69 LDA D is LDA run on whole patents, and LDA P is LDA run on the paragraphs within patents. [sent-333, score-0.171]
70 In addition, we ran another set of experiments by randomly shuffling the order of paragraphs in each patent several times before running AdaTM. [sent-336, score-0.243]
71 The positive difference means randomly shuffling the order of paragraphs indeed increases the perplexity. [sent-339, score-0.171]
72 This further shows that sequential topic structure does exist in patents, which confirms the finding in (Du et al. [sent-340, score-0.551]
73 To align the topics so visualisations match, the sequential models are initialised using an LDA model built at the chapter level. [sent-350, score-0.295]
74 Figure 6: Analysis on “The Prince”: (a) Evolution of paragraph topics for LDA; (b) Topic alignment of LDA versus AdaTM topics for chapters. [sent-358, score-0.471]
75 To visualise topic evolution, we use a plot with one colour per topic displayed over the sequence. [sent-360, score-0.736]
76 Figure 6(a) shows this for LDA run on paragraphs of “The Prince”. [sent-361, score-0.171]
77 The proportions of the 20 topics are on the Y-axis, spread across the unit interval. [sent-362, score-0.126]
78 The paragraphs run along the X-axis, so the topic evolution is clearly displayed. [sent-363, score-0.608]
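A plot of this kind can be produced, for any model that yields per-segment topic proportions, with a stacked-area chart: segments along the X-axis and the topic proportions filling the unit interval on the Y-axis, one colour per topic. A sketch assuming a proportions array of shape (num_segments, K) estimated from the fitted model:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_topic_evolution(proportions, title="Topic evolution"):
    """Stacked-area plot of per-segment topic proportions.
    proportions: array of shape (num_segments, K), rows summing to 1."""
    proportions = np.asarray(proportions)
    x = np.arange(proportions.shape[0])      # segments (paragraphs/chapters) along X
    plt.stackplot(x, proportions.T)          # one colour per topic
    plt.xlabel("segment (paragraph or chapter)")
    plt.ylabel("topic proportion")
    plt.ylim(0.0, 1.0)
    plt.title(title)
    plt.show()
```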
79 One can see there is no sequential structure in the plot derived by the LDA model, and similar plots result from “Moby Dick” for LDA. [sent-364, score-0.24]
80 The plots for the other models, chapters or paragraphs, are similar so plots like Figure 6(a) for the other models can be meaningfully compared. [sent-367, score-0.309]
81 Figure 7 then shows the corresponding evolution plots for AdaTM and SeqLDA on chapters and paragraphs. [sent-368, score-0.347]
82 The large improvement in perplexity for AdaTM (see Section 6. [sent-370, score-0.131]
83 2) means that the se- […]. Figure 7: Topic Evolution on “The Prince”: (a) AdaTM on chapters; (b) AdaTM on paragraphs; (c) SeqLDA on chapters; (d) SeqLDA on paragraphs. [sent-372, score-0.784]
84 Note that SeqLDA, while exhibiting slightly stronger sequential structure than AdaTM in these [sent-374, score-0.417]
85 figures, has significantly worse test perplexity, so its sequential effect is too strong and harms results. Figure 8: Topic Evolution on “Moby Dick”: (a) LDA on chapters, (b) STM on chapters, (c) AdaTM on chapters. [sent-375, score-0.196]
86 So while the LDA to AdaTM/SeqLDA topic correspondences are quite good due to the use of LDA initialisation, the correspondences between AdaTM and SeqLDA have degraded. [sent-378, score-0.355]
87 We see that AdaTM has nearly as good sequential characteristics as SeqLDA. [sent-379, score-0.196]
88 Furthermore, the segment topic distributions νi,j of SeqLDA gradually deviate from the document topic distribution µi, which is not the case for AdaTM. [sent-380, score-1.024]
89 Figure 8 shows similar topic evolution plots for LDA, STM and AdaTM. [sent-382, score-0.481]
90 In contrast, the AdaTM topic evolutions are much clearer for the less frequent topics, as shown in Figure 8(c). [sent-383, score-0.355]
91 Here we briefly discuss topics by their colour: black: Captain Peleg and the business of signing on; yellow: inns, housing, bed; mauve: Queequeg; azure: (around chapters 60–80) details of whales; aqua: (peaks at 8, 82, 88) pulpit, schools and mythology of whaling. [sent-385, score-0.32]
92 We see that AdaTM can be used to understand the topics with regards to the sequential structure of a book. [sent-386, score-0.295]
93 In contrast, the sequential nature for LDA and STM is lost in the noise. [sent-387, score-0.196]
94 It would be very interesting to apply the proposed topic models to text analysis tasks such as topic segmentation, summarisation, and semantic title evaluation, which we leave to future work. [sent-388, score-0.71]
95 7 Conclusion A model for adaptive sequential topic modelling has been developed to improve over a simple exchangeable segments model STM (Du et al. [sent-389, score-0.764]
96 , 2010) and a naive sequential model SeqLDA (Du et al. [sent-390, score-0.196]
97 , 2012) in terms of perplexity and its confirmed ability to uncover sequential structure in the topics. [sent-391, score-0.353]
98 One could extract meaningful topics from a book like Herman Melville’s “Moby Dick” and concurrently gain their sequential profile. [sent-392, score-0.295]
99 A segmented topic model based on the two-parameter Poisson-Dirichlet process. [sent-490, score-0.355]
100 PCFGs, topic models, adaptor grammars and learning topical collocations and the structure of proper names. [sent-541, score-0.397]
wordName wordTfidf (topN-words)
[('adatm', 0.459), ('topic', 0.355), ('chapters', 0.221), ('lda', 0.211), ('sequential', 0.196), ('paragraphs', 0.171), ('pdp', 0.17), ('seqlda', 0.17), ('du', 0.157), ('segment', 0.155), ('stm', 0.146), ('tk', 0.138), ('perplexity', 0.131), ('dick', 0.119), ('moby', 0.119), ('nk', 0.115), ('patents', 0.102), ('buntine', 0.102), ('topics', 0.099), ('contributes', 0.093), ('discretek', 0.085), ('gibbs', 0.085), ('document', 0.083), ('evolution', 0.082), ('segments', 0.079), ('modelling', 0.076), ('dirichlet', 0.074), ('blei', 0.074), ('prince', 0.073), ('patent', 0.072), ('blocked', 0.068), ('dirichletk', 0.068), ('marginalised', 0.068), ('zi', 0.064), ('indicators', 0.059), ('sampling', 0.058), ('adaptive', 0.058), ('sampler', 0.055), ('beta', 0.053), ('paragraph', 0.052), ('pdps', 0.051), ('tables', 0.051), ('auxiliary', 0.046), ('newman', 0.046), ('latent', 0.045), ('count', 0.044), ('canberra', 0.044), ('herman', 0.044), ('plots', 0.044), ('equation', 0.043), ('hierarchical', 0.043), ('topical', 0.042), ('australian', 0.041), ('teh', 0.041), ('gruber', 0.04), ('optimise', 0.04), ('distribution', 0.038), ('australia', 0.038), ('jin', 0.038), ('indicator', 0.036), ('iand', 0.035), ('posterior', 0.035), ('reconstruct', 0.034), ('lan', 0.034), ('andll', 0.034), ('arora', 0.034), ('atot', 0.034), ('ckk', 0.034), ('csiro', 0.034), ('dirichletw', 0.034), ('fixl', 0.034), ('hutter', 0.034), ('isegment', 0.034), ('misra', 0.034), ('moreno', 0.034), ('terminologies', 0.034), ('ttaheb', 0.034), ('structural', 0.033), ('customers', 0.033), ('distributions', 0.03), ('barzilay', 0.03), ('bayesian', 0.03), ('au', 0.03), ('summarisation', 0.029), ('ict', 0.029), ('melville', 0.029), ('huidong', 0.029), ('trick', 0.029), ('wray', 0.029), ('yk', 0.028), ('spread', 0.027), ('segmentation', 0.027), ('chen', 0.027), ('initialisation', 0.026), ('mixing', 0.026), ('inherited', 0.026), ('uncover', 0.026), ('colour', 0.026), ('hardisty', 0.026), ('documents', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999946 90 emnlp-2012-Modelling Sequential Text with an Adaptive Topic Model
Author: Lan Du ; Wray Buntine ; Huidong Jin
Abstract: Topic models are increasingly being used for text analysis tasks, often times replacing earlier semantic techniques such as latent semantic analysis. In this paper, we develop a novel adaptive topic model with the ability to adapt topics from both the previous segment and the parent document. For this proposed model, a Gibbs sampler is developed for doing posterior inference. Experimental results show that with topic adaptation, our model significantly improves over existing approaches in terms of perplexity, and is able to uncover clear sequential structure on, for example, Herman Melville’s book “Moby Dick”.
2 0.24646086 8 emnlp-2012-A Phrase-Discovering Topic Model Using Hierarchical Pitman-Yor Processes
Author: Robert Lindsey ; William Headden ; Michael Stipicevic
Abstract: Topic models traditionally rely on the bagof-words assumption. In data mining applications, this often results in end-users being presented with inscrutable lists of topical unigrams, single words inferred as representative of their topics. In this article, we present a hierarchical generative probabilistic model of topical phrases. The model simultaneously infers the location, length, and topic of phrases within a corpus and relaxes the bagof-words assumption within phrases by using a hierarchy of Pitman-Yor processes. We use Markov chain Monte Carlo techniques for approximate inference in the model and perform slice sampling to learn its hyperparameters. We show via an experiment on human subjects that our model finds substantially better, more interpretable topical phrases than do competing models.
3 0.2198232 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
Author: Michael J. Paul
Abstract: Recent work has explored the use of hidden Markov models for unsupervised discourse and conversation modeling, where each segment or block of text such as a message in a conversation is associated with a hidden state in a sequence. We extend this approach to allow each block of text to be a mixture of multiple classes. Under our model, the probability of a class in a text block is a log-linear function of the classes in the previous block. We show that this model performs well at predictive tasks on two conversation data sets, improving thread reconstruction accuracy by up to 15 percentage points over a standard HMM. Additionally, we show quantitatively that the induced word clusters correspond to speech acts more closely than baseline models.
4 0.21535343 49 emnlp-2012-Exploring Topic Coherence over Many Models and Many Topics
Author: Keith Stevens ; Philip Kegelmeyer ; David Andrzejewski ; David Buttler
Abstract: We apply two new automated semantic evaluations to three distinct latent topic models. Both metrics have been shown to align with human evaluations and provide a balance between internal measures of information gain and comparisons to human ratings of coherent topics. We improve upon the measures by introducing new aggregate measures that allows for comparing complete topic models. We further compare the automated measures to other metrics for topic models, comparison to manually crafted semantic tests and document classification. Our experiments reveal that LDA and LSA each have different strengths; LDA best learns descriptive topics while LSA is best at creating a compact semantic representation ofdocuments and words in a corpus.
5 0.20314194 115 emnlp-2012-SSHLDA: A Semi-Supervised Hierarchical Topic Model
Author: Xian-Ling Mao ; Zhao-Yan Ming ; Tat-Seng Chua ; Si Li ; Hongfei Yan ; Xiaoming Li
Abstract: Supervised hierarchical topic modeling and unsupervised hierarchical topic modeling are usually used to obtain hierarchical topics, such as hLLDA and hLDA. Supervised hierarchical topic modeling makes heavy use of the information from observed hierarchical labels, but cannot explore new topics; while unsupervised hierarchical topic modeling is able to detect automatically new topics in the data space, but does not make use of any information from hierarchical labels. In this paper, we propose a semi-supervised hierarchical topic model which aims to explore new topics automatically in the data space while incorporating the information from observed hierarchical labels into the modeling process, called SemiSupervised Hierarchical Latent Dirichlet Allocation (SSHLDA). We also prove that hLDA and hLLDA are special cases of SSHLDA. We . conduct experiments on Yahoo! Answers and ODP datasets, and assess the performance in terms of perplexity and clustering. The experimental results show that predictive ability of SSHLDA is better than that of baselines, and SSHLDA can also achieve significant improvement over baselines for clustering on the FScore measure.
6 0.14767028 19 emnlp-2012-An Entity-Topic Model for Entity Linking
7 0.11227107 51 emnlp-2012-Extracting Opinion Expressions with semi-Markov Conditional Random Fields
8 0.11020843 29 emnlp-2012-Concurrent Acquisition of Word Meaning and Lexical Categories
9 0.10689532 47 emnlp-2012-Explore Person Specific Evidence in Web Person Name Disambiguation
10 0.09506043 48 emnlp-2012-Exploring Adaptor Grammars for Native Language Identification
11 0.075254291 60 emnlp-2012-Generative Goal-Driven User Simulation for Dialog Management
12 0.068592794 1 emnlp-2012-A Bayesian Model for Learning SCFGs with Discontiguous Rules
13 0.067603722 33 emnlp-2012-Discovering Diverse and Salient Threads in Document Collections
14 0.066324249 114 emnlp-2012-Revisiting the Predictability of Language: Response Completion in Social Media
15 0.057936687 43 emnlp-2012-Exact Sampling and Decoding in High-Order Hidden Markov Models
16 0.049464367 130 emnlp-2012-Unambiguity Regularization for Unsupervised Learning of Probabilistic Grammars
17 0.042182997 24 emnlp-2012-Biased Representation Learning for Domain Adaptation
18 0.038959529 78 emnlp-2012-Learning Lexicon Models from Search Logs for Query Expansion
19 0.038812835 91 emnlp-2012-Monte Carlo MCMC: Efficient Inference by Approximate Sampling
20 0.038168907 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
topicId topicWeight
[(0, 0.189), (1, 0.098), (2, 0.105), (3, 0.219), (4, -0.394), (5, 0.205), (6, -0.097), (7, -0.157), (8, -0.158), (9, -0.014), (10, -0.059), (11, -0.006), (12, 0.169), (13, 0.067), (14, 0.031), (15, 0.018), (16, -0.104), (17, -0.031), (18, 0.022), (19, -0.041), (20, 0.004), (21, -0.03), (22, -0.05), (23, -0.015), (24, -0.01), (25, -0.057), (26, 0.008), (27, -0.003), (28, -0.033), (29, 0.037), (30, -0.015), (31, 0.015), (32, -0.03), (33, 0.038), (34, 0.009), (35, -0.046), (36, -0.018), (37, 0.009), (38, 0.04), (39, 0.038), (40, 0.032), (41, 0.032), (42, 0.045), (43, -0.025), (44, -0.061), (45, -0.043), (46, 0.012), (47, 0.001), (48, -0.002), (49, -0.079)]
simIndex simValue paperId paperTitle
same-paper 1 0.97655743 90 emnlp-2012-Modelling Sequential Text with an Adaptive Topic Model
Author: Lan Du ; Wray Buntine ; Huidong Jin
Abstract: Topic models are increasingly being used for text analysis tasks, often times replacing earlier semantic techniques such as latent semantic analysis. In this paper, we develop a novel adaptive topic model with the ability to adapt topics from both the previous segment and the parent document. For this proposed model, a Gibbs sampler is developed for doing posterior inference. Experimental results show that with topic adaptation, our model significantly improves over existing approaches in terms of perplexity, and is able to uncover clear sequential structure on, for example, Herman Melville’s book “Moby Dick”.
2 0.89971381 115 emnlp-2012-SSHLDA: A Semi-Supervised Hierarchical Topic Model
Author: Xian-Ling Mao ; Zhao-Yan Ming ; Tat-Seng Chua ; Si Li ; Hongfei Yan ; Xiaoming Li
Abstract: Supervised hierarchical topic modeling and unsupervised hierarchical topic modeling are usually used to obtain hierarchical topics, such as hLLDA and hLDA. Supervised hierarchical topic modeling makes heavy use of the information from observed hierarchical labels, but cannot explore new topics; while unsupervised hierarchical topic modeling is able to detect automatically new topics in the data space, but does not make use of any information from hierarchical labels. In this paper, we propose a semi-supervised hierarchical topic model which aims to explore new topics automatically in the data space while incorporating the information from observed hierarchical labels into the modeling process, called SemiSupervised Hierarchical Latent Dirichlet Allocation (SSHLDA). We also prove that hLDA and hLLDA are special cases of SSHLDA. We . conduct experiments on Yahoo! Answers and ODP datasets, and assess the performance in terms of perplexity and clustering. The experimental results show that predictive ability of SSHLDA is better than that of baselines, and SSHLDA can also achieve significant improvement over baselines for clustering on the FScore measure.
3 0.89641356 8 emnlp-2012-A Phrase-Discovering Topic Model Using Hierarchical Pitman-Yor Processes
Author: Robert Lindsey ; William Headden ; Michael Stipicevic
Abstract: Topic models traditionally rely on the bagof-words assumption. In data mining applications, this often results in end-users being presented with inscrutable lists of topical unigrams, single words inferred as representative of their topics. In this article, we present a hierarchical generative probabilistic model of topical phrases. The model simultaneously infers the location, length, and topic of phrases within a corpus and relaxes the bagof-words assumption within phrases by using a hierarchy of Pitman-Yor processes. We use Markov chain Monte Carlo techniques for approximate inference in the model and perform slice sampling to learn its hyperparameters. We show via an experiment on human subjects that our model finds substantially better, more interpretable topical phrases than do competing models.
4 0.82407027 49 emnlp-2012-Exploring Topic Coherence over Many Models and Many Topics
Author: Keith Stevens ; Philip Kegelmeyer ; David Andrzejewski ; David Buttler
Abstract: We apply two new automated semantic evaluations to three distinct latent topic models. Both metrics have been shown to align with human evaluations and provide a balance between internal measures of information gain and comparisons to human ratings of coherent topics. We improve upon the measures by introducing new aggregate measures that allows for comparing complete topic models. We further compare the automated measures to other metrics for topic models, comparison to manually crafted semantic tests and document classification. Our experiments reveal that LDA and LSA each have different strengths; LDA best learns descriptive topics while LSA is best at creating a compact semantic representation ofdocuments and words in a corpus.
5 0.71557516 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
Author: Michael J. Paul
Abstract: Recent work has explored the use of hidden Markov models for unsupervised discourse and conversation modeling, where each segment or block of text such as a message in a conversation is associated with a hidden state in a sequence. We extend this approach to allow each block of text to be a mixture of multiple classes. Under our model, the probability of a class in a text block is a log-linear function of the classes in the previous block. We show that this model performs well at predictive tasks on two conversation data sets, improving thread reconstruction accuracy by up to 15 percentage points over a standard HMM. Additionally, we show quantitatively that the induced word clusters correspond to speech acts more closely than baseline models.
6 0.51117551 19 emnlp-2012-An Entity-Topic Model for Entity Linking
7 0.44752258 29 emnlp-2012-Concurrent Acquisition of Word Meaning and Lexical Categories
8 0.40177491 48 emnlp-2012-Exploring Adaptor Grammars for Native Language Identification
9 0.40133667 47 emnlp-2012-Explore Person Specific Evidence in Web Person Name Disambiguation
10 0.39126402 33 emnlp-2012-Discovering Diverse and Salient Threads in Document Collections
11 0.37189367 60 emnlp-2012-Generative Goal-Driven User Simulation for Dialog Management
12 0.26418743 114 emnlp-2012-Revisiting the Predictability of Language: Response Completion in Social Media
13 0.24442168 51 emnlp-2012-Extracting Opinion Expressions with semi-Markov Conditional Random Fields
14 0.21751484 1 emnlp-2012-A Bayesian Model for Learning SCFGs with Discontiguous Rules
15 0.2143904 43 emnlp-2012-Exact Sampling and Decoding in High-Order Hidden Markov Models
16 0.21270092 78 emnlp-2012-Learning Lexicon Models from Search Logs for Query Expansion
17 0.19185826 130 emnlp-2012-Unambiguity Regularization for Unsupervised Learning of Probabilistic Grammars
18 0.18932569 91 emnlp-2012-Monte Carlo MCMC: Efficient Inference by Approximate Sampling
19 0.18293962 79 emnlp-2012-Learning Syntactic Categories Using Paradigmatic Representations of Word Context
20 0.18244739 9 emnlp-2012-A Sequence Labelling Approach to Quote Attribution
topicId topicWeight
[(2, 0.014), (16, 0.02), (25, 0.017), (34, 0.065), (45, 0.015), (60, 0.051), (63, 0.558), (64, 0.013), (65, 0.014), (70, 0.014), (73, 0.012), (74, 0.054), (76, 0.033), (80, 0.014), (86, 0.016), (95, 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.94808114 90 emnlp-2012-Modelling Sequential Text with an Adaptive Topic Model
Author: Lan Du ; Wray Buntine ; Huidong Jin
Abstract: Topic models are increasingly being used for text analysis tasks, often times replacing earlier semantic techniques such as latent semantic analysis. In this paper, we develop a novel adaptive topic model with the ability to adapt topics from both the previous segment and the parent document. For this proposed model, a Gibbs sampler is developed for doing posterior inference. Experimental results show that with topic adaptation, our model significantly improves over existing approaches in terms of perplexity, and is able to uncover clear sequential structure on, for example, Herman Melville’s book “Moby Dick”.
2 0.93868166 94 emnlp-2012-Multiple Aspect Summarization Using Integer Linear Programming
Author: Kristian Woodsend ; Mirella Lapata
Abstract: Multi-document summarization involves many aspects of content selection and surface realization. The summaries must be informative, succinct, grammatical, and obey stylistic writing conventions. We present a method where such individual aspects are learned separately from data (without any hand-engineering) but optimized jointly using an integer linear programme. The ILP framework allows us to combine the decisions of the expert learners and to select and rewrite source content through a mixture of objective setting, soft and hard constraints. Experimental results on the TAC-08 data set show that our model achieves state-of-the-art performance using ROUGE and significantly improves the informativeness of the summaries.
3 0.90969771 100 emnlp-2012-Open Language Learning for Information Extraction
Author: Mausam ; Michael Schmitz ; Stephen Soderland ; Robert Bart ; Oren Etzioni
Abstract: Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary sentences. However, stateof-the-art Open IE systems such as REVERB and WOE share two important weaknesses (1) they extract only relations that are mediated by verbs, and (2) they ignore context, thus extracting tuples that are not asserted as factual. This paper presents OLLIE, a substantially improved Open IE system that addresses both these limitations. First, OLLIE achieves high yield by extracting relations mediated by nouns, adjectives, and more. Second, a context-analysis step increases precision by including contextual information from the sentence in the extractions. OLLIE obtains 2.7 times the area under precision-yield curve (AUC) compared to REVERB and 1.9 times the AUC of WOEparse. –
4 0.89503086 17 emnlp-2012-An "AI readability" Formula for French as a Foreign Language
Author: Thomas Francois ; Cedrick Fairon
Abstract: This paper present a new readability formula for French as a foreign language (FFL), which relies on 46 textual features representative of the lexical, syntactic, and semantic levels as well as some of the specificities of the FFL context. We report comparisons between several techniques for feature selection and various learning algorithms. Our best model, based on support vector machines (SVM), significantly outperforms previous FFL formulas. We also found that semantic features behave poorly in our case, in contrast with some previous readability studies on English as a first language.
5 0.80869544 97 emnlp-2012-Natural Language Questions for the Web of Data
Author: Mohamed Yahya ; Klaus Berberich ; Shady Elbassuoni ; Maya Ramanath ; Volker Tresp ; Gerhard Weikum
Abstract: The Linked Data initiative comprises structured databases in the Semantic-Web data model RDF. Exploring this heterogeneous data by structured query languages is tedious and error-prone even for skilled users. To ease the task, this paper presents a methodology for translating natural language questions into structured SPARQL queries over linked-data sources. Our method is based on an integer linear program to solve several disambiguation tasks jointly: the segmentation of questions into phrases; the mapping of phrases to semantic entities, classes, and relations; and the construction of SPARQL triple patterns. Our solution harnesses the rich type system provided by knowledge bases in the web of linked data, to constrain our semantic-coherence objective function. We present experiments on both the . in question translation and the resulting query answering.
6 0.66434777 8 emnlp-2012-A Phrase-Discovering Topic Model Using Hierarchical Pitman-Yor Processes
7 0.65422595 20 emnlp-2012-Answering Opinion Questions on Products by Exploiting Hierarchical Organization of Consumer Reviews
8 0.64696127 115 emnlp-2012-SSHLDA: A Semi-Supervised Hierarchical Topic Model
9 0.59339452 103 emnlp-2012-PATTY: A Taxonomy of Relational Patterns with Semantic Types
10 0.55614483 33 emnlp-2012-Discovering Diverse and Salient Threads in Document Collections
11 0.55026835 42 emnlp-2012-Entropy-based Pruning for Phrase-based Machine Translation
12 0.54945356 11 emnlp-2012-A Systematic Comparison of Phrase Table Pruning Techniques
13 0.54803127 27 emnlp-2012-Characterizing Stylistic Elements in Syntactic Structure
14 0.54087406 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
15 0.53657293 128 emnlp-2012-Translation Model Based Cross-Lingual Language Model Adaptation: from Word Models to Phrase Models
16 0.52812982 14 emnlp-2012-A Weakly Supervised Model for Sentence-Level Semantic Orientation Analysis with Multiple Experts
17 0.52010703 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
18 0.51736856 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
19 0.51413143 114 emnlp-2012-Revisiting the Predictability of Language: Response Completion in Social Media
20 0.51368076 19 emnlp-2012-An Entity-Topic Model for Entity Linking