emnlp emnlp2010 emnlp2010-23 knowledge-graph by maker-knowledge-mining

23 emnlp-2010-Automatic Keyphrase Extraction via Topic Decomposition


Source: pdf

Author: Zhiyuan Liu ; Wenyi Huang ; Yabin Zheng ; Maosong Sun

Abstract: Existing graph-based ranking methods for keyphrase extraction compute a single importance score for each word via a single random walk. Motivated by the fact that both documents and words can be represented by a mixture of semantic topics, we propose to decompose traditional random walk into multiple random walks specific to various topics. We thus build a Topical PageRank (TPR) on word graph to measure word importance with respect to different topics. After that, given the topic distribution of the document, we further calculate the ranking scores of words and extract the top ranked ones as keyphrases. Experimental results show that TPR outperforms state-of-the-art keyphrase extraction methods on two datasets under various evaluation metrics.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: Existing graph-based ranking methods for keyphrase extraction compute a single importance score for each word via a single random walk. [sent-6, score-0.611]

2 After that, given the topic distribution of the document, we further calculate the ranking scores of words and extract the top ranked ones as keyphrases. [sent-9, score-0.233]

3 Experimental results show that TPR outperforms state-of-the-art keyphrase extraction methods on two datasets under various evaluation metrics. [sent-10, score-0.532]

4 Automatic keyphrase extraction is widely used in information retrieval and digital library (Turney, 2000; Nguyen and Kan, 2007). [sent-12, score-0.549]

5 Keyphrase extraction is also an essential step in various tasks of natural language processing such as document categorization, clustering and summarization (Manning and Schutze, 2000). [sent-13, score-0.147]

6 The supervised approach (Turney, 1999) regards keyphrase extraction as a classification task, in which a model is trained to determine whether a candidate phrase is a keyphrase. [sent-15, score-0.557]

7 Supervised methods require a document set with human-assigned keyphrases as the training set. [sent-16, score-0.534]

8 In the Web era, articles increase exponentially and change dynamically, which demands that keyphrase extraction be efficient and adaptable. [sent-17, score-0.553]

9 For example, the words “phrase” and “extraction” will be ranked as more important in the topic “keyphrase extraction”, while the words “graph” and “PageRank” will be more important in the topic “random walk”. [sent-31, score-0.257]

10 Good keyphrases should be relevant to the major topics of the given document. [sent-33, score-0.649]

11 An appropriate set of keyphrases should also have a good coverage of the document’s major topics. [sent-38, score-0.534]

12 In graph-based methods, the extracted keyphrases may fall into a single topic of the document and fail to cover other substantial topics of the document. [sent-39, score-0.884]

13 To address the problem, it is intuitive to consider the topics of words and documents in the random walk for keyphrase extraction. [sent-40, score-0.735]

14 In this paper, we propose to decompose traditional PageRank into multiple PageRanks specific to various topics and obtain the importance scores of words under different topics. [sent-41, score-0.17]

15 After that, with the help of the document topics, we can further extract keyphrases that are relevant to the document and at the same time have a good coverage of the document’s major topics. [sent-42, score-0.748]

16 In experiments we find that TPR can extract keyphrases with high relevance and good coverage, outperforming other baseline methods under various evaluation metrics on two datasets. [sent-44, score-0.554]

17 TPR for keyphrase extraction is a two-stage process: 1. [sent-47, score-0.532]

18 Build a topic interpreter to acquire the topics of words and documents. [sent-48, score-0.272]

19 , 1990); (2) Use unsupervised machine learning techniques to obtain word topics from a large-scale document collection. [sent-56, score-0.212]

20 Since the vocabulary in WordNet cannot cover many words in modern news and research articles, we employ the second approach to build topic interpreters for TPR. [sent-57, score-0.208]

21 These methods, known as latent topic models, derive latent topics from a large-scale document collection according to word occurrence information. [sent-59, score-0.374]

22 In LDA, each word w of a document d is regarded to be generated by first sampling a topic z from d’s topic distribution θ(d), and then sampling a word from the topic’s distribution over words φ(z); θ(d) and φ(z) are drawn from Dirichlet priors α and β, separately. [sent-64, score-0.358]

23 Using LDA, we can obtain the topic distribution of each word w, namely pr(z|w) for topic z ∈ K. [sent-67, score-0.236]
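
As a rough illustration of this step, the sketch below derives pr(z|w) from an LDA model's outputs via Bayes' rule; it is our own minimal example, not code from the paper, and the array names (topic_word_counts, topic_prior) are hypothetical.

    import numpy as np

    def word_topic_distribution(topic_word_counts, topic_prior):
        """Estimate pr(z|w) for every word from LDA sufficient statistics.

        topic_word_counts: array of shape (K, V), counts of word v assigned to topic z.
        topic_prior:       array of shape (K,), e.g. the estimated topic proportions pr(z).
        Returns an array of shape (V, K) whose rows are pr(z|w).
        """
        # pr(w|z): normalize each topic's word counts.
        pr_w_given_z = topic_word_counts / topic_word_counts.sum(axis=1, keepdims=True)
        # Bayes' rule: pr(z|w) is proportional to pr(w|z) * pr(z).
        joint = pr_w_given_z * topic_prior[:, None]          # shape (K, V)
        pr_z_given_w = (joint / joint.sum(axis=0)).T          # shape (V, K)
        return pr_z_given_w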

24 Moreover, using the obtained word topic distributions, we can infer the topic distribution of a new document (Blei et al. [sent-70, score-0.333]

25 3 Topical PageRank for Keyphrase Extraction After building a topic interpreter to acquire the topics of words and documents, we can perform keyphrase extraction for documents via TPR. [sent-73, score-0.804]

26 Given a document d, the process of keyphrase extraction using TPR consists of the following four steps, which are also illustrated in Fig. [sent-74, score-0.629]

27 Using the topic-specific importance scores of words, rank candidate keyphrases with respect to each topic separately. [sent-81, score-0.744]

28 Given the topics of document d, integrate the topic-specific rankings of candidate keyphrases into a final ranking, and select the top-ranked ones as keyphrases. [sent-83, score-0.792]

29 The document is regarded as a word sequence, and the link weight between two words is simply set to their co-occurrence count within a sliding window of at most W words over the word sequence. [sent-86, score-0.174]

30 It was reported in (Mihalcea and Tarau, 2004) that the graph direction does not influence the performance of keyphrase extraction very much. [sent-87, score-0.613]

31 Since keyphrases are usually noun phrases, we only add adjectives and nouns to the word graph. [sent-91, score-0.534]
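
A minimal sketch of how such a co-occurrence word graph could be built is given below; the paper does not provide code here, so the Penn Treebank tag set used for filtering and the exact window handling are our assumptions.

    from collections import defaultdict

    KEEP_TAGS = {"NN", "NNS", "NNP", "NNPS", "JJ"}   # nouns and adjectives (assumed tag set)

    def build_word_graph(tagged_tokens, window=10):
        """tagged_tokens: list of (word, pos) pairs in document order.
        Returns a dict edge -> co-occurrence weight for an undirected word graph."""
        words = [w.lower() for w, pos in tagged_tokens if pos in KEEP_TAGS]
        edges = defaultdict(int)
        for i, wi in enumerate(words):
            for wj in words[i + 1 : i + window]:
                if wi != wj:
                    edges[tuple(sorted((wi, wj)))] += 1   # symmetric co-occurrence count
        return edges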

32 The damping factor indicates that each vertex has a probability of (1 − λ) to perform a random jump to another vertex within this graph, where |V| is the number of vertices. [sent-103, score-0.171]

33 Formally, in the PageRank of a specific topic z, we will assign a topic-specific preference value pz(w) to each word w as its random jump probability, with Σ_{w∈V} pz(w) = 1. [sent-116, score-0.202]
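
The sketch below shows a topic-biased PageRank iteration of this kind, assuming a column-stochastic transition matrix built from the word graph and a topic-specific preference vector p_z; the damping value and convergence test are our own choices, not the paper's.

    import numpy as np

    def topical_pagerank(M, p_z, damping=0.85, tol=1e-8, max_iter=200):
        """M:   (N, N) column-stochastic transition matrix of the word graph,
                where M[i, j] is the probability of moving from word j to word i.
           p_z: (N,) topic-specific preference vector, summing to 1.
           Returns the topic-specific importance scores of the N words."""
        r = np.full(len(p_z), 1.0 / len(p_z))
        for _ in range(max_iter):
            r_next = damping * M.dot(r) + (1.0 - damping) * p_z
            if np.abs(r_next - r).sum() < tol:
                return r_next
            r = r_next
        return r

In the full method, this iteration would be run once per topic z, each time with that topic's own preference vector.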

34 As reported in (Hulth, 2003), most manually assigned keyphrases turn out to be noun phrases. [sent-123, score-0.534]

35 We thus select noun phrases from a document as candidate keyphrases for ranking. [sent-124, score-0.673]

36 The candidate keyphrases of a document are obtained as follows. [sent-125, score-0.656]
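
The summary does not quote the exact selection procedure, so the sketch below assumes the common pattern of zero or more adjectives followed by one or more nouns over Penn Treebank POS tags; it is an illustration, not the authors' implementation.

    import re

    def candidate_noun_phrases(tagged_tokens):
        """tagged_tokens: list of (word, pos) pairs.
        Returns candidate phrases matching the assumed pattern (JJ)* (NN|NNS|NNP|NNPS)+."""
        # Encode the tag sequence as single letters: A = adjective, N = noun, O = other.
        code = "".join("A" if pos == "JJ"
                       else "N" if pos.startswith("NN")
                       else "O"
                       for _, pos in tagged_tokens)
        candidates = []
        for match in re.finditer(r"A*N+", code):
            phrase = " ".join(w for w, _ in tagged_tokens[match.start():match.end()])
            candidates.append(phrase.lower())
        return candidates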

37 • pz(w) = pr(z|w) is the probability of topic z given word w. [sent-136, score-0.194]

38 We further integrate topic-specific rankings of candidate keyphrases into a final ranking and extract top-ranked ones as the keyphrases of the document. [sent-140, score-1.171]

39 Denote the topic distribution of the document d as pr(z|d) for each topic z. [sent-141, score-0.333]

40 For each candidate keyphrase p, we compute its final ranking score as follows (footnote 1: in experiments we use the Stanford POS Tagger from http://nlp.). [sent-142, score-0.565]

41 R(p|d) = Σ_{z=1..K} Rz(p) × pr(z|d), (5) where Rz(p) is the topic-specific ranking score of p under topic z. After ranking candidate phrases in descending order of their integrated ranking scores, we select the top M as the keyphrases of document d. [sent-147, score-0.789]
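
A sketch of this integration step is given below. How a phrase's topic-specific score is derived from its words is not spelled out in this summary, so the sketch assumes it is the sum of the scores of the phrase's words; the function and variable names are ours.

    def rank_keyphrases(candidates, word_scores_per_topic, doc_topic_dist, top_m=10):
        """candidates:            list of candidate phrases (space-separated words).
           word_scores_per_topic: list of K dicts, word -> topic-specific PageRank score.
           doc_topic_dist:        list of K probabilities pr(z|d).
           Returns the top_m candidates by the integrated score of Eq. (5)."""
        def phrase_score(phrase, scores):
            # Assumed: a phrase's score under topic z is the sum of its words' scores.
            return sum(scores.get(w, 0.0) for w in phrase.split())

        final = {
            p: sum(pr_z * phrase_score(p, scores)
                   for pr_z, scores in zip(doc_topic_dist, word_scores_per_topic))
            for p in candidates
        }
        return sorted(final, key=final.get, reverse=True)[:top_m]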

42 1 Datasets To evaluate the performance of TPR for keyphrase extraction, we carry out experiments on two datasets. [sent-149, score-0.482]

43 There are at most 10 keyphrases for each document. [sent-153, score-0.534]

44 Since neither NEWS nor RESEARCH itself is large enough to learn reliable topics, we use the Wikipedia snapshot of March 2008 to build topic interpreters with LDA. [sent-158, score-0.149]
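
As an illustration of building such a topic interpreter, the sketch below trains LDA with gensim (our choice of toolkit; the paper does not name one here) and infers pr(z|d) for a new document.

    from gensim import corpora, models

    def train_topic_interpreter(tokenized_docs, num_topics=100):
        """tokenized_docs: list of token lists from a large background corpus (e.g. Wikipedia)."""
        dictionary = corpora.Dictionary(tokenized_docs)
        bow_corpus = [dictionary.doc2bow(doc) for doc in tokenized_docs]
        lda = models.LdaModel(corpus=bow_corpus, id2word=dictionary,
                              num_topics=num_topics, passes=5)
        return dictionary, lda

    def doc_topic_distribution(dictionary, lda, tokens):
        """Infer pr(z|d) for a new document given as a token list."""
        bow = dictionary.doc2bow(tokens)
        return dict(lda.get_document_topics(bow))   # {topic_id: probability}

The number of topics (100 here) is only a placeholder; the influence of K is examined in Table 2.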

45 For words absent from the topic models, we simply set the topic distribution of the word to a uniform distribution. [sent-163, score-0.236]

46 2 Evaluation Metrics For evaluation, the words in both standard and extracted keyphrases are reduced to base forms (footnote 2: http://wanxiaojun1979.). [sent-165, score-0.554]

47 We note that the ranking order of the extracted keyphrases also reflects the performance of a method. [sent-175, score-0.612]

48 An extraction method will be better than another one if it can rank correct keyphrases higher. [sent-176, score-0.614]

49 However, precision/recall/F-measure does not take the order of extracted keyphrases into account. [sent-177, score-0.554]

50 Bpref is suitable for evaluating performance while taking into account the order in which the extracted keyphrases are ranked. [sent-180, score-0.554]

51 For a document, if there are R correct keyphrases within the M keyphrases extracted by a method, in which r is a correct keyphrase and n is an incorrect keyphrase, Bpref is defined as follows: Bpref = (1/R) Σ_r (1 − |n ranked higher than r| / M). (7) [sent-181, score-0.534]

52 The other metric is mean reciprocal rank (MRR) (Voorhees, 2000), which is used to evaluate how the first correct keyphrase for each document is ranked. [sent-182, score-0.609]

53 For a document d, rank_d is denoted as the rank of the first correct keyphrase among all extracted keyphrases; MRR is defined as follows: MRR = (1/|D|) Σ_{d∈D} 1/rank_d, (8) where D is the document set for keyphrase extraction. [sent-183, score-1.208]
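
The sketch below computes Bpref (Eq. 7) for a single document and MRR (Eq. 8) over a document set, following the definitions above; matching predicted and gold keyphrases after stemming is omitted for brevity, and the assumption of no duplicate predictions is ours.

    def bpref(extracted, gold):
        """extracted: ranked list of M predicted keyphrases; gold: set of correct keyphrases."""
        correct = [p for p in extracted if p in gold]
        if not correct:
            return 0.0
        m = len(extracted)
        score = 0.0
        for r in correct:
            wrong_above = sum(1 for p in extracted[:extracted.index(r)] if p not in gold)
            score += 1.0 - wrong_above / m
        return score / len(correct)

    def mean_reciprocal_rank(all_extracted, all_gold):
        """Average of 1 / rank of the first correct keyphrase over all documents."""
        total = 0.0
        for extracted, gold in zip(all_extracted, all_gold):
            for rank, p in enumerate(extracted, start=1):
                if p in gold:
                    total += 1.0 / rank
                    break
        return total / len(all_extracted)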

54 Note that although the evaluation scores of most keyphrase extractors are still lower compared to (footnote 5: http://tartarus.) [sent-184, score-0.498]

55 other NLP-tasks, it does not indicate the performance is poor because even different annotators may assign different keyphrases to the same document. [sent-186, score-0.534]

56 In this section, we look into the influences of these parameters on TPR for keyphrase extraction. [sent-189, score-0.514]

57 Table 1: Influence of window size W when the number of keyphrases M = 10 on NEWS. [sent-223, score-0.556]

58 If the window size W is set too large on RESEARCH, the graph will become fully connected and the weights of links will tend to be equal, which cannot capture the local structure information of abstracts for keyphrase extraction. [sent-227, score-0.567]

59 2 The Number of Topics K We demonstrate the influence of the number of topics K of LDA models in Table 2. [sent-230, score-0.155]

60 The influence is similar on RESEARCH, which indicates that LDA is appropriate for obtaining topics of words and documents for TPR to extract keyphrases. [sent-233, score-0.175]

61 Table 2: Influence of the number of topics K when the number of keyphrases M = 10 on NEWS. [sent-262, score-0.649]

62 In Table 3 we show the influence when the number of keyphrases M = 10 on NEWS. [sent-284, score-0.574]

63 In the keyphrase extraction task, we are required to find the keyphrases that can appropriately represent the topics of the document. [sent-287, score-1.181]

64 We thus do not want to extract phrases that, like common words, may appear in multiple topics. [sent-288, score-0.152]

65 Table 3: Influence of three preference value settings when the number of keyphrases M = 10 on NEWS. [sent-322, score-0.581]

66 Given the topics of the document d and a word w, we have used various methods to compute their similarity, including cosine similarity, predictive likelihood and KL-divergence (Heinrich, 2005), among which cosine similarity performs the best on both datasets. [sent-335, score-0.212]
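
A minimal sketch of this cosine-similarity scoring follows, assuming the word topic distributions pr(z|w) are stacked in a matrix and the document distribution pr(z|d) is a vector; the small epsilon guarding against division by zero is our addition.

    import numpy as np

    def cosine_word_scores(word_topic_dists, doc_topic_dist):
        """word_topic_dists: (V, K) array whose rows are pr(z|w) for each word;
           doc_topic_dist:   (K,) vector pr(z|d) for the document.
           Returns a (V,) array of cosine similarities used to score the words."""
        num = word_topic_dists.dot(doc_topic_dist)
        denom = (np.linalg.norm(word_topic_dists, axis=1)
                 * np.linalg.norm(doc_topic_dist) + 1e-12)
        return num / denom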

67 Since the average number of manually labeled keyphrases on NEWS is larger than on RESEARCH, we set M = 10 for NEWS and M = 5 for RESEARCH. [sent-338, score-0.534]

68 Table 4: Comparing results on NEWS when the number of keyphrases M = 10. [sent-364, score-0.534]

69 Table 5: Comparing results on RESEARCH when the number of keyphrases M = 5. [sent-388, score-0.534]

70 However, the performance of LDA under MRR is much worse than that of TFIDF and PageRank, which indicates that LDA fails to rank the first correct keyphrase as high as the other methods. [sent-395, score-0.502]

71 In contrast, TPR enjoys the advantages of both LDA and TFIDF/PageRank, by using external topic information like LDA and internal document structure like TFIDF/PageRank. [sent-397, score-0.215]

72 Each point on the precision-recall curve is evaluated at a different number of extracted keyphrases M. [sent-399, score-0.554]

73 5 Extracting Example Finally, in Table 6 we show an example of keyphrases extracted using TPR from a news article with the title “Arafat Says U. [sent-403, score-0.642]

74 We also mark the number of correctly extracted keyphrases after the method name, e.g., “(+7)” after TPR. [sent-408, score-0.554]

75 We also illustrate the top 3 topics of the document with their topic-specific keyphrases. [sent-409, score-0.212]

76 By integrating these topic-specific keyphrases according to the proportions of these topics, we obtain the best keyphrase extraction performance using TPR. [sent-411, score-1.066]

77 In Table 7 we also show the extracted keyphrases of baselines from the same news article. [sent-412, score-0.613]

78 TFIDF only considered the frequency properties of words, and thus ranked highly the phrases containing “PLO”, which appeared about 16 times in this article, and failed to extract the keyphrases on the topic “Israel”. [sent-413, score-0.71]

79 LDA only measured the importance of words using document topics without considering the frequency information of words and thus missed keyphrases with high-frequency words. [sent-414, score-0.767]

80 For example, LDA failed to extract the keyphrase “political assassination”, although the word “assassination” occurred 8 times in this article. [sent-415, score-0.502]

81 5 Related Work In this paper we proposed TPR for keyphrase extraction. [sent-416, score-0.482]

82 A pioneering work in keyphrase extraction was (Turney, 1999), which regarded keyphrase extraction as a classification task. [sent-417, score-1.089]

83 government document, Palestine Liberation Organization leader, political assassination(+), Israeli officials(+), alleged document. TPR, Rank 2 Topic on “Israel”: PLO leader Yasser Arafat(+), United States(+), Palestine Liberation Organization leader, Israeli officials(+), U. [sent-421, score-0.298]

84 government document, alleged document, Arab government, slaying Wazir, State Department spokesman Charles Redman, Khalil Wazir(+). TPR, Rank 3 Topic on “terrorism”: terrorist attacks(+), PLO leader Yasser Arafat(+), Abu Jihad, United States(+), alleged document, U. [sent-423, score-0.3]

85 government document, Palestine Liberation Organization leader, State Department spokesman Charles Redman, political assassination(+), full cooperation. Table 6: Extracted keyphrases by TPR. [sent-425, score-0.631]

86 Starting with TextRank (Mihalcea and Tarau, 2004), graph-based ranking methods are becoming the most widely used unsupervised approach for keyphrase extraction. [sent-426, score-0.54]

87 Litvak and Last (2008) applied the HITS algorithm to the word graph of a document for keyphrase extraction. [sent-427, score-0.523]

88 Wan (2008b; 2008a) used a small number of nearest neighbor documents to provide more knowledge for keyphrase extraction. [sent-429, score-0.482]

89 Some methods used clustering techniques on word graphs for keyphrase extraction (Grineva et al. [sent-430, score-0.532]

90 ...clusters, and (2) how to weight each cluster and select keyphrases from the clusters. [sent-438, score-0.534]

91 In recent years, two algorithms were proposed to rank web pages by incorporating topic information of web pages within PageRank (Haveliwala, 2002; Nie et al. [sent-440, score-0.148]

92 , 2006) is that, when surfing along a graph link from vertex wi to wj, the ranking score on topic z of wi has a higher probability of passing to the same topic of wj and a lower probability of passing to a different topic of wj. [sent-451, score-0.661]

93 We implemented the method and found that the random jumps between topics did not help improve the performance of keyphrase extraction, so we do not report the results of this method. [sent-453, score-0.597]

94 6 Conclusion and Future Work In this paper we propose a new graph-based framework, Topical PageRank, which incorporates topic information within the random walk for keyphrase extraction. [sent-454, score-0.641]

95 We plan to obtain topics using other machine learning methods and from other knowledge bases, and to investigate their influence on the performance of keyphrase extraction. [sent-460, score-0.637]

96 We plan to consider topic information in other graph-based ranking algorithms such as HITS (Kleinberg, 1999). [sent-463, score-0.176]

97 We will investigate the influence of corpus selection in training LDA for keyphrase extraction using TPR. [sent-467, score-0.572]

98 Clustering to find exemplar terms for keyphrase extraction. [sent-538, score-0.482]

99 The pagerank citation ranking: Bringing order to the web. [sent-580, score-0.3]

100 Collabrank: Towards a collaborative approach to single-document keyphrase extraction. [sent-600, score-0.482]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('keyphrases', 0.534), ('keyphrase', 0.482), ('tpr', 0.39), ('pagerank', 0.3), ('plo', 0.149), ('lda', 0.134), ('topic', 0.118), ('topics', 0.115), ('document', 0.097), ('leader', 0.097), ('pr', 0.084), ('arafat', 0.082), ('wazir', 0.072), ('yasser', 0.072), ('abu', 0.062), ('assassination', 0.062), ('jihad', 0.062), ('officials', 0.062), ('news', 0.059), ('tfidf', 0.059), ('ranking', 0.058), ('pz', 0.055), ('topical', 0.052), ('bpref', 0.051), ('haveliwala', 0.051), ('redman', 0.051), ('extraction', 0.05), ('palestine', 0.048), ('preference', 0.047), ('wan', 0.045), ('wj', 0.045), ('vertex', 0.045), ('wi', 0.044), ('damping', 0.044), ('alleged', 0.041), ('hulth', 0.041), ('walk', 0.041), ('graph', 0.041), ('influence', 0.04), ('attacks', 0.04), ('israeli', 0.039), ('jump', 0.037), ('mrr', 0.037), ('khalil', 0.035), ('liberation', 0.035), ('spokesman', 0.034), ('government', 0.034), ('influences', 0.032), ('interpreters', 0.031), ('slaying', 0.031), ('tarau', 0.031), ('textrank', 0.031), ('yabin', 0.031), ('ranges', 0.031), ('link', 0.03), ('rank', 0.03), ('article', 0.029), ('hits', 0.029), ('political', 0.029), ('palestinian', 0.027), ('nie', 0.027), ('charles', 0.027), ('xiaojun', 0.026), ('regarded', 0.025), ('candidate', 0.025), ('mihalcea', 0.024), ('united', 0.023), ('abstracts', 0.022), ('latent', 0.022), ('window', 0.022), ('terrorist', 0.022), ('importance', 0.021), ('articles', 0.021), ('ranked', 0.021), ('anette', 0.021), ('grineva', 0.021), ('guerrillas', 0.021), ('interpreter', 0.021), ('jianguo', 0.021), ('leaders', 0.021), ('litvak', 0.021), ('offices', 0.021), ('pageranks', 0.021), ('particulary', 0.021), ('pdr', 0.021), ('squad', 0.021), ('zhiyuan', 0.021), ('xiao', 0.021), ('extracted', 0.02), ('extract', 0.02), ('israel', 0.019), ('zheng', 0.019), ('decompose', 0.018), ('maosong', 0.018), ('terrorism', 0.018), ('turney', 0.018), ('acquire', 0.018), ('digital', 0.017), ('phrases', 0.017), ('scores', 0.016)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 23 emnlp-2010-Automatic Keyphrase Extraction via Topic Decomposition

Author: Zhiyuan Liu ; Wenyi Huang ; Yabin Zheng ; Maosong Sun

Abstract: Existing graph-based ranking methods for keyphrase extraction compute a single importance score for each word via a single random walk. Motivated by the fact that both documents and words can be represented by a mixture of semantic topics, we propose to decompose traditional random walk into multiple random walks specific to various topics. We thus build a Topical PageRank (TPR) on word graph to measure word importance with respect to different topics. After that, given the topic distribution of the document, we further calculate the ranking scores of words and extract the top ranked ones as keyphrases. Experimental results show that TPR outperforms state-of-the-art keyphrase extraction methods on two datasets under various evaluation metrics.

2 0.10965269 100 emnlp-2010-Staying Informed: Supervised and Semi-Supervised Multi-View Topical Analysis of Ideological Perspective

Author: Amr Ahmed ; Eric Xing

Abstract: With the proliferation of user-generated articles over the web, it becomes imperative to develop automated methods that are aware of the ideological-bias implicit in a document collection. While there exist methods that can classify the ideological bias of a given document, little has been done toward understanding the nature of this bias on a topical-level. In this paper we address the problem ofmodeling ideological perspective on a topical level using a factored topic model. We develop efficient inference algorithms using Collapsed Gibbs sampling for posterior inference, and give various evaluations and illustrations of the utility of our model on various document collections with promising results. Finally we give a Metropolis-Hasting inference algorithm for a semi-supervised extension with decent results.

3 0.099303707 58 emnlp-2010-Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation

Author: Jordan Boyd-Graber ; Philip Resnik

Abstract: In this paper, we develop multilingual supervised latent Dirichlet allocation (MLSLDA), a probabilistic generative model that allows insights gleaned from one language’s data to inform how the model captures properties of other languages. MLSLDA accomplishes this by jointly modeling two aspects of text: how multilingual concepts are clustered into thematically coherent topics and how topics associated with text connect to an observed regression variable (such as ratings on a sentiment scale). Concepts are represented in a general hierarchical framework that is flexible enough to express semantic ontologies, dictionaries, clustering constraints, and, as a special, degenerate case, conventional topic models. Both the topics and the regression are discovered via posterior inference from corpora. We show MLSLDA can build topics that are consistent across languages, discover sensible bilingual lexical correspondences, and leverage multilingual corpora to better predict sentiment. Sentiment analysis (Pang and Lee, 2008) offers the promise of automatically discerning how people feel about a product, person, organization, or issue based on what they write online, which is potentially of great value to businesses and other organizations. However, the vast majority of sentiment resources and algorithms are limited to a single language, usually English (Wilson, 2008; Baccianella and Sebastiani, 2010). Since no single language captures a majority of the content online, adopting such a limited approach in an increasingly global community risks missing important details and trends that might only be available when text in multiple languages is taken into account. 45 Philip Resnik Department of Linguistics and UMIACS University of Maryland College Park, MD re snik@umd .edu Up to this point, multiple languages have been addressed in sentiment analysis primarily by transferring knowledge from a resource-rich language to a less rich language (Banea et al., 2008), or by ignoring differences in languages via translation into English (Denecke, 2008). These approaches are limited to a view of sentiment that takes place through an English-centric lens, and they ignore the potential to share information between languages. Ideally, learning sentiment cues holistically, across languages, would result in a richer and more globally consistent picture. In this paper, we introduce Multilingual Supervised Latent Dirichlet Allocation (MLSLDA), a model for sentiment analysis on a multilingual corpus. MLSLDA discovers a consistent, unified picture of sentiment across multiple languages by learning “topics,” probabilistic partitions of the vocabulary that are consistent in terms of both meaning and relevance to observed sentiment. Our approach makes few assumptions about available resources, requiring neither parallel corpora nor machine translation. The rest of the paper proceeds as follows. In Section 1, we describe the probabilistic tools that we use to create consistent topics bridging across languages and the MLSLDA model. In Section 2, we present the inference process. We discuss our set of semantic bridges between languages in Section 3, and our experiments in Section 4 demonstrate that this approach functions as an effective multilingual topic model, discovers sentiment-biased topics, and uses multilingual corpora to make better sentiment predictions across languages. Sections 5 and 6 discuss related research and discusses future work, respectively. 

4 0.089694925 48 emnlp-2010-Exploiting Conversation Structure in Unsupervised Topic Segmentation for Emails

Author: Shafiq Joty ; Giuseppe Carenini ; Gabriel Murray ; Raymond T. Ng

Abstract: This work concerns automatic topic segmentation of email conversations. We present a corpus of email threads manually annotated with topics, and evaluate annotator reliability. To our knowledge, this is the first such email corpus. We show how the existing topic segmentation models (i.e., Lexical Chain Segmenter (LCSeg) and Latent Dirichlet Allocation (LDA)) which are solely based on lexical information, can be applied to emails. By pointing out where these methods fail and what any desired model should consider, we propose two novel extensions of the models that not only use lexical information but also exploit finer level conversation structure in a principled way. Empirical evaluation shows that LCSeg is a better model than LDA for segmenting an email thread into topical clusters and incorporating conversation structure into these models improves the performance significantly.

5 0.073213361 6 emnlp-2010-A Latent Variable Model for Geographic Lexical Variation

Author: Jacob Eisenstein ; Brendan O'Connor ; Noah A. Smith ; Eric P. Xing

Abstract: The rapid growth of geotagged social media raises new computational possibilities for investigating geographic linguistic variation. In this paper, we present a multi-level generative model that reasons jointly about latent topics and geographical regions. High-level topics such as “sports” or “entertainment” are rendered differently in each geographic region, revealing topic-specific regional distinctions. Applied to a new dataset of geotagged microblogs, our model recovers coherent topics and their regional variants, while identifying geographic areas of linguistic consistency. The model also enables prediction of an author’s geographic location from raw text, outperforming both text regression and supervised topic models.

6 0.062333375 45 emnlp-2010-Evaluating Models of Latent Document Semantics in the Presence of OCR Errors

7 0.057158217 64 emnlp-2010-Incorporating Content Structure into Text Analysis Applications

8 0.055292513 77 emnlp-2010-Measuring Distributional Similarity in Context

9 0.052977595 109 emnlp-2010-Translingual Document Representations from Discriminative Projections

10 0.046067059 32 emnlp-2010-Context Comparison of Bursty Events in Web Search and Online Media

11 0.042508382 33 emnlp-2010-Cross Language Text Classification by Model Translation and Semi-Supervised Learning

12 0.037960295 85 emnlp-2010-Negative Training Data Can be Harmful to Text Classification

13 0.037549358 102 emnlp-2010-Summarizing Contrastive Viewpoints in Opinionated Text

14 0.037436657 84 emnlp-2010-NLP on Spoken Documents Without ASR

15 0.036213059 34 emnlp-2010-Crouching Dirichlet, Hidden Markov Model: Unsupervised POS Tagging with Context Local Tag Generation

16 0.031819798 73 emnlp-2010-Learning Recurrent Event Queries for Web Search

17 0.029925829 66 emnlp-2010-Inducing Word Senses to Improve Web Search Result Clustering

18 0.029693082 70 emnlp-2010-Jointly Modeling Aspects and Opinions with a MaxEnt-LDA Hybrid

19 0.02853716 124 emnlp-2010-Word Sense Induction Disambiguation Using Hierarchical Random Graphs

20 0.028342508 79 emnlp-2010-Mining Name Translations from Entity Graph Mapping


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.109), (1, 0.094), (2, -0.145), (3, -0.082), (4, 0.127), (5, 0.023), (6, -0.054), (7, -0.05), (8, -0.036), (9, -0.062), (10, -0.02), (11, 0.005), (12, -0.091), (13, 0.074), (14, -0.02), (15, 0.037), (16, 0.01), (17, 0.046), (18, 0.102), (19, -0.077), (20, 0.127), (21, 0.0), (22, 0.004), (23, 0.005), (24, 0.094), (25, 0.025), (26, 0.003), (27, 0.036), (28, 0.067), (29, 0.032), (30, -0.014), (31, -0.031), (32, 0.066), (33, -0.14), (34, 0.044), (35, -0.145), (36, 0.092), (37, -0.237), (38, -0.117), (39, -0.112), (40, -0.021), (41, -0.096), (42, 0.111), (43, -0.014), (44, 0.143), (45, -0.128), (46, 0.006), (47, -0.012), (48, -0.04), (49, 0.05)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93389916 23 emnlp-2010-Automatic Keyphrase Extraction via Topic Decomposition

Author: Zhiyuan Liu ; Wenyi Huang ; Yabin Zheng ; Maosong Sun

Abstract: Existing graph-based ranking methods for keyphrase extraction compute a single importance score for each word via a single random walk. Motivated by the fact that both documents and words can be represented by a mixture of semantic topics, we propose to decompose traditional random walk into multiple random walks specific to various topics. We thus build a Topical PageRank (TPR) on word graph to measure word importance with respect to different topics. After that, given the topic distribution of the document, we further calculate the ranking scores of words and extract the top ranked ones as keyphrases. Experimental results show that TPR outperforms state-of-the-art keyphrase extraction methods on two datasets under various evaluation metrics.

2 0.6019035 48 emnlp-2010-Exploiting Conversation Structure in Unsupervised Topic Segmentation for Emails

Author: Shafiq Joty ; Giuseppe Carenini ; Gabriel Murray ; Raymond T. Ng

Abstract: This work concerns automatic topic segmentation of email conversations. We present a corpus of email threads manually annotated with topics, and evaluate annotator reliability. To our knowledge, this is the first such email corpus. We show how the existing topic segmentation models (i.e., Lexical Chain Segmenter (LCSeg) and Latent Dirichlet Allocation (LDA)) which are solely based on lexical information, can be applied to emails. By pointing out where these methods fail and what any desired model should consider, we propose two novel extensions of the models that not only use lexical information but also exploit finer level conversation structure in a principled way. Empirical evaluation shows that LCSeg is a better model than LDA for segmenting an email thread into topical clusters and incorporating conversation structure into these models improves the performance significantly.

3 0.55679166 45 emnlp-2010-Evaluating Models of Latent Document Semantics in the Presence of OCR Errors

Author: Daniel Walker ; William B. Lund ; Eric K. Ringger

Abstract: Models of latent document semantics such as the mixture of multinomials model and Latent Dirichlet Allocation have received substantial attention for their ability to discover topical semantics in large collections of text. In an effort to apply such models to noisy optical character recognition (OCR) text output, we endeavor to understand the effect that character-level noise can have on unsupervised topic modeling. We show the effects both with document-level topic analysis (document clustering) and with word-level topic analysis (LDA) on both synthetic and real-world OCR data. As expected, experimental results show that performance declines as word error rates increase. Common techniques for alleviating these problems, such as filtering low-frequency words, are successful in enhancing model quality, but exhibit failure trends similar to models trained on unpro- cessed OCR output in the case of LDA. To our knowledge, this study is the first of its kind.

4 0.5444169 100 emnlp-2010-Staying Informed: Supervised and Semi-Supervised Multi-View Topical Analysis of Ideological Perspective

Author: Amr Ahmed ; Eric Xing

Abstract: With the proliferation of user-generated articles over the web, it becomes imperative to develop automated methods that are aware of the ideological-bias implicit in a document collection. While there exist methods that can classify the ideological bias of a given document, little has been done toward understanding the nature of this bias on a topical-level. In this paper we address the problem ofmodeling ideological perspective on a topical level using a factored topic model. We develop efficient inference algorithms using Collapsed Gibbs sampling for posterior inference, and give various evaluations and illustrations of the utility of our model on various document collections with promising results. Finally we give a Metropolis-Hasting inference algorithm for a semi-supervised extension with decent results.

5 0.42742667 6 emnlp-2010-A Latent Variable Model for Geographic Lexical Variation

Author: Jacob Eisenstein ; Brendan O'Connor ; Noah A. Smith ; Eric P. Xing

Abstract: The rapid growth of geotagged social media raises new computational possibilities for investigating geographic linguistic variation. In this paper, we present a multi-level generative model that reasons jointly about latent topics and geographical regions. High-level topics such as “sports” or “entertainment” are rendered differently in each geographic region, revealing topic-specific regional distinctions. Applied to a new dataset of geotagged microblogs, our model recovers coherent topics and their regional variants, while identifying geographic areas of linguistic consistency. The model also enables prediction of an author’s geographic location from raw text, outperforming both text regression and supervised topic models.

6 0.41888577 58 emnlp-2010-Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation

7 0.31089005 109 emnlp-2010-Translingual Document Representations from Discriminative Projections

8 0.29690275 77 emnlp-2010-Measuring Distributional Similarity in Context

9 0.26388833 110 emnlp-2010-Turbo Parsers: Dependency Parsing by Approximate Variational Inference

10 0.23946358 32 emnlp-2010-Context Comparison of Bursty Events in Web Search and Online Media

11 0.23672929 124 emnlp-2010-Word Sense Induction Disambiguation Using Hierarchical Random Graphs

12 0.220236 91 emnlp-2010-Practical Linguistic Steganography Using Contextual Synonym Substitution and Vertex Colour Coding

13 0.2080999 85 emnlp-2010-Negative Training Data Can be Harmful to Text Classification

14 0.19534045 64 emnlp-2010-Incorporating Content Structure into Text Analysis Applications

15 0.19266759 37 emnlp-2010-Domain Adaptation of Rule-Based Annotators for Named-Entity Recognition Tasks

16 0.18428928 34 emnlp-2010-Crouching Dirichlet, Hidden Markov Model: Unsupervised POS Tagging with Context Local Tag Generation

17 0.16441567 102 emnlp-2010-Summarizing Contrastive Viewpoints in Opinionated Text

18 0.16217609 53 emnlp-2010-Fusing Eye Gaze with Speech Recognition Hypotheses to Resolve Exophoric References in Situated Dialogue

19 0.1610184 105 emnlp-2010-Title Generation with Quasi-Synchronous Grammar

20 0.15975414 122 emnlp-2010-WikiWars: A New Corpus for Research on Temporal Expressions


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.017), (12, 0.037), (29, 0.095), (30, 0.045), (52, 0.015), (56, 0.07), (62, 0.018), (66, 0.093), (72, 0.04), (82, 0.01), (87, 0.417), (89, 0.013)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.80300921 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing

Author: Eugene Charniak

Abstract: We present a new syntactic parser that works left-to-right and top down, thus maintaining a fully-connected parse tree for a few alternative parse hypotheses. All of the commonly used statistical parsers use context-free dynamic programming algorithms and as such work bottom up on the entire sentence. Thus they only find a complete fully connected parse at the very end. In contrast, both subjective and experimental evidence show that people understand a sentence word-to-word as they go along, or close to it. The constraint that the parser keeps one or more fully connected syntactic trees is intended to operationalize this cognitive fact. Our parser achieves a new best result for topdown parsers of 89.4%,a 20% error reduction over the previous single-parser best result for parsers of this type of 86.8% (Roark, 2001) . The improved performance is due to embracing the very large feature set available in exchange for giving up dynamic programming.

2 0.78726041 30 emnlp-2010-Confidence in Structured-Prediction Using Confidence-Weighted Models

Author: Avihai Mejer ; Koby Crammer

Abstract: Confidence-Weighted linear classifiers (CW) and its successors were shown to perform well on binary and multiclass NLP problems. In this paper we extend the CW approach for sequence learning and show that it achieves state-of-the-art performance on four noun phrase chucking and named entity recognition tasks. We then derive few algorithmic approaches to estimate the prediction’s correctness of each label in the output sequence. We show that our approach provides a reliable relative correctness information as it outperforms other alternatives in ranking label-predictions according to their error. We also show empirically that our methods output close to absolute estimation of error. Finally, we show how to use this information to improve active learning.

same-paper 3 0.72733897 23 emnlp-2010-Automatic Keyphrase Extraction via Topic Decomposition

Author: Zhiyuan Liu ; Wenyi Huang ; Yabin Zheng ; Maosong Sun

Abstract: Existing graph-based ranking methods for keyphrase extraction compute a single importance score for each word via a single random walk. Motivated by the fact that both documents and words can be represented by a mixture of semantic topics, we propose to decompose traditional random walk into multiple random walks specific to various topics. We thus build a Topical PageRank (TPR) on word graph to measure word importance with respect to different topics. After that, given the topic distribution of the document, we further calculate the ranking scores of words and extract the top ranked ones as keyphrases. Experimental results show that TPR outperforms state-of-the-art keyphrase extraction methods on two datasets under various evaluation metrics.

4 0.48830363 86 emnlp-2010-Non-Isomorphic Forest Pair Translation

Author: Hui Zhang ; Min Zhang ; Haizhou Li ; Eng Siong Chng

Abstract: This paper studies two issues, non-isomorphic structure translation and target syntactic structure usage, for statistical machine translation in the context of forest-based tree to tree sequence translation. For the first issue, we propose a novel non-isomorphic translation framework to capture more non-isomorphic structure mappings than traditional tree-based and tree-sequence-based translation methods. For the second issue, we propose a parallel space searching method to generate hypothesis using tree-to-string model and evaluate its syntactic goodness using tree-to-tree/tree sequence model. This not only reduces the search complexity by merging spurious-ambiguity translation paths and solves the data sparseness issue in training, but also serves as a syntax-based target language model for better grammatical generation. Experiment results on the benchmark data show our proposed two solutions are very effective, achieving significant performance improvement over baselines when applying to different translation models.

5 0.43227759 25 emnlp-2010-Better Punctuation Prediction with Dynamic Conditional Random Fields

Author: Wei Lu ; Hwee Tou Ng

Abstract: This paper focuses on the task of inserting punctuation symbols into transcribed conversational speech texts, without relying on prosodic cues. We investigate limitations associated with previous methods, and propose a novel approach based on dynamic conditional random fields. Different from previous work, our proposed approach is designed to jointly perform both sentence boundary and sentence type prediction, and punctuation prediction on speech utterances. We performed evaluations on a transcribed conversational speech domain consisting of both English and Chinese texts. Empirical results show that our method outperforms an approach based on linear-chain conditional random fields and other previous approaches.

6 0.43046847 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice

7 0.4231309 42 emnlp-2010-Efficient Incremental Decoding for Tree-to-String Translation

8 0.42096603 119 emnlp-2010-We're Not in Kansas Anymore: Detecting Domain Changes in Streams

9 0.41580826 105 emnlp-2010-Title Generation with Quasi-Synchronous Grammar

10 0.40995759 20 emnlp-2010-Automatic Detection and Classification of Social Events

11 0.40519142 84 emnlp-2010-NLP on Spoken Documents Without ASR

12 0.40174076 60 emnlp-2010-Improved Fully Unsupervised Parsing with Zoomed Learning

13 0.40090567 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions

14 0.39899334 69 emnlp-2010-Joint Training and Decoding Using Virtual Nodes for Cascaded Segmentation and Tagging Tasks

15 0.3979845 6 emnlp-2010-A Latent Variable Model for Geographic Lexical Variation

16 0.39629716 35 emnlp-2010-Discriminative Sample Selection for Statistical Machine Translation

17 0.393868 7 emnlp-2010-A Mixture Model with Sharing for Lexical Semantics

18 0.38998556 58 emnlp-2010-Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation

19 0.38929129 82 emnlp-2010-Multi-Document Summarization Using A* Search and Discriminative Learning

20 0.3879717 87 emnlp-2010-Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space