acl acl2010 acl2010-204 knowledge-graph by maker-knowledge-mining

204 acl-2010-Recommendation in Internet Forums and Blogs


Source: pdf

Author: Jia Wang ; Qing Li ; Yuanzhu Peter Chen ; Zhangxi Lin

Abstract: The variety of engaging interactions among users in social media distinguishes it from traditional Web media. Such a feature should be utilized while attempting to provide intelligent services to social media participants. In this article, we present a framework to recommend relevant information in Internet forums and blogs using user comments, one of the most representative of user behaviors in online discussion. When incorporating user comments, we consider structural, semantic, and authority information carried by them. One of the most important observations from this work is that semantic contents of user comments can play a fairly different role in a different form of social media. When designing a recommendation system for this purpose, such a difference must be considered with caution.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract The variety of engaging interactions among users in social media distinguishes it from traditional Web media. [sent-6, score-0.334]

2 Such a feature should be utilized while attempting to provide intelligent services to social media participants. [sent-7, score-0.31]

3 In this article, we present a framework to recommend relevant information in Internet forums and blogs using user comments, one of the most representative of user behaviors in online discussion. [sent-8, score-0.568]

4 When incorporating user comments, we consider structural, semantic, and authority information carried by them. [sent-9, score-0.318]

5 One of the most important observation from this work is that semantic contents of user comments can play a fairly different role in a different form of social media. [sent-10, score-0.577]

6 When designing a recommendation system for this purpose, such a difference must be considered with caution. [sent-11, score-0.289]

7 Various engaging interactions among users in social media differentiate it from traditional Web sites. [sent-20, score-0.445]

8 Such characteristics should be utilized in attempt to provide intelligent services to social media users. [sent-21, score-0.31]

9 In self-publication, or customer-generated media, a user can publish an article or post news to share with others. [sent-23, score-0.373]

10 Other users can read and comment on the posting and these comments can, in turn, be read and commented on. [sent-24, score-0.989]

11 The user experience with the system can be immensely enhanced with the recommended articles. [sent-38, score-0.29]

12 In this work, we focus on recommendation in Internet forums [...] vibrant creation, sharing, and collaboration among the users (Ahn et al. [sent-39, score-0.336]

13 In a discussion thread, the original posting is typically followed by other readers’ opinions, in the form of comments. [sent-48, score-0.569]

14 Apparently, there is a need to consider topic evolution in adaptive content-based recommendation and this requires novel techniques in order to capture topic evolution precisely and to prevent drastic topic shifting which returns completely irrelevant articles to users. [sent-54, score-0.825]

15 In this work, we present a framework to recommend relevant information in Internet forums and blogs using user comments, one of the most representative recordings of user behaviors in these forms of social media. [sent-55, score-0.728]

16 We model the relationship among comments and that relative to the original posting using graphs in order to evaluate their combined impact. [sent-58, score-0.747]

17 In addition, the weight of a comment is further enhanced with its content and with the authority of its poster. [sent-59, score-0.524]

18 2 Related Work In a broader context, a related problem is content-based information recommendation (or filtering). [sent-60, score-0.289]

19 Most information recommender systems select articles based on the contents of the original postings. [sent-61, score-0.434]

20 The relevant news selections of these works are determined by the textual similarity between the recommended news and the original news posting. [sent-63, score-0.695]

21 , 1999) combine the news content with numerical user ratings. [sent-67, score-0.341]

22 Lee and Park (Lee and Park, 2007) consider matching between news article attributes and user preferences. [sent-72, score-0.373]

23 Some go even further by ignoring the news contents and only using browsing behaviors of the readers with similar interests (Das et al. [sent-80, score-0.369]

24 TDT consists of breaking the stream of news into individual news stories, monitoring the stories for events that have not been seen before, and categorizing them (Lavrenko and Croft, 2001). [sent-85, score-0.329]

25 A topic is modeled with a language profile deduced by the news. [sent-86, score-0.288]

26 Most existing TDT schemes calculate the similarity between a piece of news and a topic profile to determine its topic relevance (Lavrenko and Croft, 2001) (Yang et al. [sent-87, score-0.555]

27 , 2009) apply TDT techniques to group news for collaborative news recommendation. [sent-90, score-0.292]

28 Most recent research on information recommendation in social media focuses on the blogosphere. [sent-94, score-0.56]

29 That is, the knowledge in the blogosphere is enriched by such engaging interactions among bloggers and readers as posting, commenting and tagging. [sent-98, score-0.314]

30 Prior to this work, the linking structure and user tagging mechanisms in the blogosphere are the most widely adopted ones to model such collective wisdom. [sent-99, score-0.274]

31 Due to the interactions between bloggers and readers, blog recommendation should not limit its input to blog postings alone but should also incorporate feedback from the readers. [sent-107, score-0.756]

32 We first describe the design of our recommendation framework in Section 3. [sent-109, score-0.289]

33 3 System Design In this section, we present a mechanism for recommendation in Internet forums and blogs. [sent-161, score-0.37]

34 Essentially, it builds a topic profile for each original posting along with the comments from readers, and uses this profile to retrieve relevant articles. [sent-163, score-1.257]

35 Then, with such collective wisdom, we use a graph to model the relationship among comments and that relative to the original posting in order to evaluate the impact of each comment. [sent-165, score-0.834]

36 This information along with the original posting and its comments are fed into a synthesizer. [sent-167, score-0.747]

37 The synthesizer balances views from both authors and readers to construct a topic profile to retrieve relevant articles. [sent-168, score-0.477]

38 1 Incorporating Comments In a discussion thread, comments made at different levels reflect the variation of focus of readers. [sent-170, score-0.272]

39 Therefore, recommended articles should reflect their concerns to complement the author’s opinion. [sent-171, score-0.323]

40 1 Authority Scoring Comments Intuitively, each comment may have a different degree of authority determined by the status of its author (Hu et al. [sent-178, score-0.425]

41 We consider the cases that a user replies to a previous posting and that a user quotes a previous posting separately. [sent-184, score-1.22]

42 For user , we use to denote the number of times that has replied to user . [sent-185, score-0.28]

43 We combine them linearly: Further, we normalize the above quantity to record how frequently a user refers to another: $\sum n(i,k) + \epsilon$ In line with the PageRank algorithm, we define the authority of user [sent-187, score-0.458]
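The authority computation described in sentences 41 to 43 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `authority_scores`, the mixing weight `beta`, the damping factor, and the iteration count are all assumptions introduced here. Only the smoothed count of the form n(i,k) + ε and the PageRank-style propagation come from the text.

```python
from collections import defaultdict


def authority_scores(replies, quotes, beta=0.5, eps=1e-6, damping=0.85, iters=50):
    """PageRank-style authority over a user-reference graph (illustrative).

    replies[(i, j)] / quotes[(i, j)]: times user i replied to / quoted user j.
    beta, eps, damping, and iters are assumed values, not taken from the paper.
    """
    users = sorted({u for pair in list(replies) + list(quotes) for u in pair})

    # Linear combination of reply counts and quotation counts.
    n = defaultdict(float)
    for (i, j), count in replies.items():
        n[(i, j)] += beta * count
    for (i, j), count in quotes.items():
        n[(i, j)] += (1.0 - beta) * count

    # Smoothed row normalization: how frequently user i refers to user j.
    p = {}
    for i in users:
        total = sum(n[(i, k)] + eps for k in users)
        for j in users:
            p[(i, j)] = (n[(i, j)] + eps) / total

    # Power iteration: a user gains authority when authoritative users refer to them.
    score = {u: 1.0 / len(users) for u in users}
    for _ in range(iters):
        score = {
            j: (1.0 - damping) / len(users)
            + damping * sum(score[i] * p[(i, j)] for i in users)
            for j in users
        }
    return score
```

Calling `authority_scores({("a", "c"): 3, ("b", "c"): 2, ("c", "a"): 1}, {("a", "c"): 1})` gives user `c` the largest score, because both other users refer to `c` most often.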

44 2 as Differentiating comments with Semantic and Structural relations Next, we construct a similar model in terms of the comments themselves. [sent-189, score-0.45]

45 In this model, we treat the original posting and the comments each as a text node. [sent-190, score-0.747]

46 First, a comment can be made in response to the original posting or at most one earlier comment. [sent-194, score-0.769]

47 In particular, the original posting is the root and all the comments are ordinary nodes. [sent-196, score-0.747]

48 There is an arc (directed edge) from node to node , denoted , if the corresponding comment is made in response to comment (or original posting) . [sent-197, score-0.575]

49 M Figure 2: Multi-relation graph of comments based on the structural and semantic information denoted . [sent-205, score-0.326]

50 There is an arc from node to node , denoted ,if the corresponding comment quotes comment (or original posting) . [sent-206, score-0.575]
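The reply and quotation arcs described above can be held in two adjacency matrices. The sketch below is only an illustration under stated assumptions: node 0 stands for the original posting, comments are numbered in chronological order, and the function name `build_relation_matrices` and its input layout are hypothetical.

```python
def build_relation_matrices(num_nodes, reply_to, quotes):
    """Build 0/1 adjacency matrices for reply and quotation arcs (illustrative).

    Node 0 is the original posting; nodes 1..num_nodes-1 are comments,
    assumed to be numbered in posting order.
    reply_to: dict mapping comment i to the single node j it responds to.
    quotes:   iterable of (i, j) pairs meaning comment i quotes node j.
    """
    reply = [[0] * num_nodes for _ in range(num_nodes)]
    quote = [[0] * num_nodes for _ in range(num_nodes)]

    for i, j in reply_to.items():
        # A comment answers the posting or one earlier comment, so j < i.
        if not 0 <= j < i < num_nodes:
            raise ValueError(f"invalid reply arc {i} -> {j}")
        reply[i][j] = 1

    for i, j in quotes:
        if not 0 <= j < i < num_nodes:
            raise ValueError(f"invalid quotation arc {i} -> {j}")
        quote[i][j] = 1

    return reply, quote
```

Because each comment has at most one reply target, every row of the reply matrix contains at most a single 1, which is what makes the reply structure a tree rooted at the posting.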

51 In some social networking media, a user may have a subset of other users as “friends”. [sent-212, score-0.378]

52 Thus, with this information and assuming poster has made a comment k for user ’s posting, the final weight of this comment is defined as [sent-214, score-0.678]

53 2 Topic Profile Construction Once the weight of comments on one posting is quantified by our models, this information along with the entire discussion thread is fed into a synthesizer to construct a topic profile. [sent-215, score-0.992]

54 It is a linear combination, with weights $(1-\alpha)$ and $\alpha$, of the contribution by the posting itself and that by the comments. [sent-220, score-0.47]
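A small sketch of this linear combination may help. It assumes, based on the surviving formula fragments and the later discussion of the combination coefficient, that the posting contribution is weighted by 1 - alpha and the comment contribution by alpha. The function name `combine_profiles` and the dictionary layout are hypothetical.

```python
def combine_profiles(post_weights, comment_weights, alpha):
    """Blend per-term weights from the posting and from its comments.

    post_weights / comment_weights: dicts mapping term -> weight.
    alpha: combination coefficient in [0, 1]; 0 keeps only the posting,
    1 keeps only the comments (assumed reading of the source formula).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    terms = set(post_weights) | set(comment_weights)
    return {
        t: (1.0 - alpha) * post_weights.get(t, 0.0)
        + alpha * comment_weights.get(t, 0.0)
        for t in terms
    }
```

With `alpha=0.0` the result equals the posting weights on every posting term, and with `alpha=1.0` it equals the comment weights, which matches the two extreme cases discussed in the parameter study.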

55 Thus, when the original posting and comments are each considered as a document, this term frequency can be calculated for any term in any document. [sent-225, score-0.813]

56 That is, the contribution of comment score is incorporated into weight calculation of the words in a comment. [sent-227, score-0.291]
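One way to fold comment scores into term weights is to scale each comment's term frequencies by that comment's score before summing. The sketch below is a guess at the general shape only: the function name `comment_term_weights`, the tokenized input, and the final division by the maximum weight are assumptions, not the paper's exact formula.

```python
from collections import Counter


def comment_term_weights(comments, scores):
    """Score-weighted term weights over a set of comments (illustrative).

    comments: list of token lists, one per comment.
    scores:   list of non-negative comment weights, same length as comments.
    Returns term -> weight, scaled so the largest weight is 1.0 (assumed).
    """
    weights = Counter()
    for tokens, score in zip(comments, scores):
        for term, tf in Counter(tokens).items():
            # A term counts for more when it appears in a highly weighted comment.
            weights[term] += score * tf
    top = max(weights.values(), default=0.0)
    if top <= 0:
        return {}
    return {term: w / top for term, w in weights.items()}
```

For example, `comment_term_weights([["tax", "cut"], ["tax", "tax"]], [1.0, 0.5])` yields a weight of 1.0 for `tax` and 0.5 for `cut`.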

57 $\max w(t)$ $\max s(i)$ Such a treatment of compounded weight essentially recognizes readers’ impact on selecting relevant articles and the differences in their influence. [sent-228, score-0.272]

58 With the topic profile thus constructed, the retriever returns an ordered list of articles with decreasing relevance to the topic. [sent-230, score-0.496]
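The retrieval step can be illustrated with a simple ranking routine. Cosine similarity is used here purely as a stand-in; the excerpt does not say which retrieval model the authors plug in. The function name `rank_articles` and the sparse-dictionary vectors are hypothetical.

```python
import math


def rank_articles(profile, articles, k=10):
    """Return the k articles most similar to the topic profile (illustrative).

    profile:  dict term -> weight for the topic profile.
    articles: dict article id -> dict term -> weight.
    Cosine similarity is an assumed stand-in for the actual retrieval model.
    """
    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    scored = [(cosine(profile, vec), name) for name, vec in articles.items()]
    # Highest similarity first; ties broken by article id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored[:k]]
```

Any retrieval model that accepts a weighted term vector could replace the inner `cosine` function without changing how the profile itself is built.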

59 Note that our approach to differentiate the importance of each comment can be easily incorporated into any generic retrieval model. [sent-231, score-0.318]

60 Given original posting and recommended article , if , for a given generalization threshold , then B is marked as a generalization. [sent-252, score-0.805]

61 4 Experimental Evaluation To evaluate the effectiveness of our proposed recommendation mechanism, we carry out a series of experiments on two synthetic data sets, collected from Internet forums and blogs, respectively. [sent-257, score-0.37]

62 This data set is constructed by randomly selecting 20 news articles with corresponding reader comments from the Digg Web site and 16,718 news articles from the Reuters news Web site. [sent-259, score-1.045]

63 This simulates the scenario of recommending relevant news from traditional media to social media users for their further reading. [sent-260, score-0.663]

64 The second one is the Blog data set containing 15 blog articles with user comments and 15,110 articles obtained from the Myhome Web site 2. [sent-261, score-0.873]

65 The recommendation engine may return a set of essentially the same articles re-posted at different sites. [sent-270, score-0.462]

66 In our experiments, we define precision and novelty metrics based on $|C \cap R|$, where $C$ is the subset of the top-ranked articles returned by the recommender, $R$ is the set of manually tagged relevant articles, and the novelty metric replaces $R$ with the set of manually tagged relevant articles excluding duplicates of the original posting. [sent-272, score-0.603]
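The two metrics can be sketched as set overlaps over the returned list. The formulas are damaged in this excerpt, so dividing by the number of returned articles is an assumption (the usual precision-at-k convention). The function name `precision_and_novelty` and the explicit `duplicates` argument are also hypothetical.

```python
def precision_and_novelty(returned, relevant, duplicates, k=10):
    """Precision and novelty over the top-k returned articles (illustrative).

    returned:   ranked list of article ids from the recommender.
    relevant:   ids manually tagged as relevant.
    duplicates: relevant ids that merely duplicate the original posting.
    Dividing by the size of the returned set is an assumed normalization.
    """
    top = set(list(returned)[:k])
    if not top:
        return 0.0, 0.0
    relevant = set(relevant)
    novel = relevant - set(duplicates)
    precision = len(top & relevant) / len(top)
    novelty = len(top & novel) / len(top)
    return precision, novelty
```

By construction novelty can never exceed precision, since the duplicate-free relevant set is a subset of the relevant set.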

67 We select the top 10 articles for evaluation assuming most readers only browse up to 10 recommended articles (Karypis, 2001). [sent-273, score-0.595]

68 Next, we study the effect of user authority and its integration to comment weighting. [sent-278, score-0.565]

69 1 Overall Performance As baseline proposals, we also implement two well-known content-based recommendation methods (Bogers and Bosch, 2007). [sent-283, score-0.289]

70 Following the strategy of Bogers and Bosch, relevant articles are selected based on the title and the first 10 sentences of the original postings. [sent-294, score-0.28]

71 Trimming the rest of an article would usually remove relatively less crucial information, which speeds up the recommendation process. [sent-296, score-0.376]

72 Our explanation is that blog articles may not be organized in the inverted pyramid style as strictly as news forum articles. [sent-304, score-0.641]

73 1) the number of the most weighted words to represent the topic, and 2) combination coefficient to determine the contribution of original posting and comments in selecting relevant articles. [sent-307, score-0.802]

74 When $\alpha$ is set to 0, the recommended articles only reflect the author’s opinion. [sent-311, score-0.323]

75 When $\alpha = 1$, the suggested articles represent the concerns of readers exclusively. [sent-312, score-0.272]

76 3 Effect of Authority and Comments In this part, we explore the contribution of user authority and comments in social media recommender. [sent-323, score-0.814]

77 RUN 1 (Posting): the topic profile is constructed only based on the original posting itself. [sent-328, score-0.81]

78 RUN 2 (Posting+Authority): the topic profile is constructed based on the original posting and participant authority. [sent-330, score-0.81]

79 RUN 3 (Posting+Comment): the topic profile is constructed based on the original posting and its comments. [sent-331, score-0.81]

80 RUN 4 (All): the topic profile is constructed based on the original posting, user authority, and its comments. [sent-332, score-0.48]

81 There is a stepwise performance improvement while integrating user authority, comments and both. [sent-335, score-0.365]

82 With the assistance of user authority and comments, the recommendation precision is improved up to 9. [sent-336, score-0.607]

83 Figure 3: Effect of content, quotation and reply relation Content Relation (CR): only the content relation matrix is used in scoring the comments. [sent-344, score-0.401]

84 Quotation+Reply Relation (QRR): both the quotation and reply relation matrices are used in scoring the comments. [sent-349, score-0.301]

85 For the case of Forum, we observe that incorporating content information adversely affects recommendation precision. [sent-352, score-0.344]

86 Specifically, comments in news forums usually carry much richer structural information than blogs where comments are usually “flat” among themselves. [sent-359, score-0.826]

87 4 Recommendation Interpretation To evaluate the precision of interpreting the relationship between recommended articles and the original posting, the evaluation metric of success rate is defined in terms of the number of recommended articles and the error weight of each recommended article. [sent-361, score-0.806]

88 Note that these rates include the errors introduced by the irrelevant articles returned by the retrieval module. [sent-366, score-0.277]

89 Traditional recommendation is essentially a push service to provide information according to the profile of individual or groups of users. [sent-376, score-0.456]

90 In this work, we present a framework for information recommendation in such social media as Internet forums and blogs. [sent-379, score-0.641]

91 This model incorporates information of user status and comment semantics and structures within the entire discussion thread. [sent-380, score-0.434]

92 By combining such information with traditional statistical language models, it is capable of suggesting relevant articles that meet the dynamic nature of a discussion in social media. [sent-382, score-0.468]

93 One important discovery from this work is that, when integrating comment contents, the structural information among comments, and reader relationship, it is crucial to distinguish the characteristics of various forms of social media. [sent-383, score-0.484]

94 The reason is that the role that the semantic content of a comment plays can differ from one form to another. [sent-384, score-0.302]

95 For example, we can also evaluate its effectiveness and costs during the operation of a discussion forum, where the discussion thread is continually updated by new comments and votes. [sent-386, score-0.369]

96 Open user profiles for adaptive news systems: help or harm? [sent-397, score-0.336]

97 An intelligent news recommender agent for filtering and categorizing large volumes of text corpus. [sent-427, score-0.342]

98 An analysis of bloggers, topics and tags for a blog recommender system. [sent-449, score-0.319]

99 A synthetical approach for blog recommendation: Combining trust, social relation, and semantic analysis. [sent-486, score-0.322]

100 News recommender system based on topic detection and tracking. [sent-494, score-0.278]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('posting', 0.47), ('recommendation', 0.289), ('comment', 0.247), ('comments', 0.225), ('authority', 0.178), ('articles', 0.173), ('profile', 0.167), ('blog', 0.162), ('social', 0.16), ('recommender', 0.157), ('recommended', 0.15), ('news', 0.146), ('user', 0.14), ('topic', 0.121), ('forum', 0.12), ('media', 0.111), ('blogs', 0.108), ('readers', 0.099), ('lavrenko', 0.092), ('reply', 0.089), ('specialization', 0.088), ('quotation', 0.088), ('article', 0.087), ('forums', 0.081), ('blogosphere', 0.078), ('tdt', 0.071), ('retrieval', 0.071), ('collective', 0.056), ('relevant', 0.055), ('content', 0.055), ('internet', 0.055), ('web', 0.054), ('bogers', 0.053), ('cantador', 0.053), ('corso', 0.053), ('esmaili', 0.053), ('postings', 0.053), ('tracking', 0.052), ('contents', 0.052), ('original', 0.052), ('profiles', 0.05), ('thread', 0.05), ('sigir', 0.049), ('relation', 0.049), ('interactions', 0.047), ('discussion', 0.047), ('engaging', 0.047), ('users', 0.047), ('generalization', 0.046), ('behaviors', 0.044), ('weight', 0.044), ('chen', 0.043), ('qing', 0.043), ('claypool', 0.043), ('hayes', 0.043), ('bloggers', 0.043), ('structural', 0.041), ('agarwal', 0.04), ('inverted', 0.04), ('ax', 0.039), ('intelligent', 0.039), ('scoring', 0.038), ('lai', 0.038), ('qiu', 0.038), ('wisdom', 0.038), ('acm', 0.038), ('stories', 0.037), ('matrices', 0.037), ('reader', 0.036), ('avesani', 0.035), ('bellogin', 0.035), ('candan', 0.035), ('digg', 0.035), ('gull', 0.035), ('jia', 0.035), ('joo', 0.035), ('leek', 0.035), ('ponte', 0.035), ('retriever', 0.035), ('southwestern', 0.035), ('synthesizer', 0.035), ('trusting', 0.035), ('yuanzhu', 0.035), ('zhangxi', 0.035), ('li', 0.034), ('traditional', 0.033), ('matrix', 0.033), ('term', 0.033), ('returned', 0.033), ('graph', 0.031), ('novelty', 0.031), ('duplicate', 0.031), ('ahn', 0.031), ('networking', 0.031), ('victor', 0.03), ('keyword', 0.03), ('croft', 0.029), ('denoted', 0.029), ('interests', 0.028), ('brin', 0.028)]
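A word list like the one above could be produced by a routine along the following lines. This is only a sketch: the site does not document its weighting scheme, so the smoothed inverse document frequency and the L2 normalization used here are assumptions, and the function name `top_tfidf_words` is hypothetical.

```python
import math
from collections import Counter


def top_tfidf_words(docs, index, top_n=5):
    """Top-weighted tf-idf terms for one document in a corpus (illustrative).

    docs:  list of token lists, one per document.
    index: position of the document to describe.
    Uses a smoothed idf, log((1 + N) / (1 + df)) + 1, and L2 normalization;
    both are assumed choices, not confirmed by the source.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    tf = Counter(docs[index])
    weights = {
        term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
        for term, count in tf.items()
    }
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0:
        return []
    ranked = sorted(
        ((term, w / norm) for term, w in weights.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )
    return ranked[:top_n]
```

Terms that are frequent in the chosen document but rare elsewhere rise to the top, which is consistent with words such as "posting" and "recommendation" leading the list for this paper.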

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000011 204 acl-2010-Recommendation in Internet Forums and Blogs


2 0.11859902 123 acl-2010-Generating Focused Topic-Specific Sentiment Lexicons

Author: Valentin Jijkoun ; Maarten de Rijke ; Wouter Weerkamp

Abstract: We present a method for automatically generating focused and accurate topic-specific subjectivity lexicons from a general purpose polarity lexicon that allow users to pin-point subjective on-topic information in a set of relevant documents. We motivate the need for such lexicons in the field of media analysis, describe a bootstrapping method for generating a topic-specific lexicon from a general purpose polarity lexicon, and evaluate the quality of the generated lexicons both manually and using a TREC Blog track test set for opinionated blog post retrieval. Although the generated lexicons can be an order of magnitude more selective than the general purpose lexicon, they maintain, or even improve, the performance of an opinion retrieval system.

3 0.11797155 174 acl-2010-Modeling Semantic Relevance for Question-Answer Pairs in Web Social Communities

Author: Baoxun Wang ; Xiaolong Wang ; Chengjie Sun ; Bingquan Liu ; Lin Sun

Abstract: Quantifying the semantic relevance between questions and their candidate answers is essential to answer detection in social media corpora. In this paper, a deep belief network is proposed to model the semantic relevance for question-answer pairs. Observing the textual similarity between the community-driven question-answering (cQA) dataset and the forum dataset, we present a novel learning strategy to promote the performance of our method on the social community datasets without hand-annotating work. The experimental results show that our method outperforms the traditional approaches on both the cQA and the forum corpora.

4 0.11486632 129 acl-2010-Growing Related Words from Seed via User Behaviors: A Re-Ranking Based Approach

Author: Yabin Zheng ; Zhiyuan Liu ; Lixing Xie

Abstract: Motivated by Google Sets, we study the problem of growing related words from a single seed word by leveraging user behaviors hiding in user records of Chinese input method. Our proposed method is motivated by the observation that the more frequently two words co-occur in user records, the more related they are. First, we utilize user behaviors to generate candidate words. Then, we utilize search engine to enrich candidate words with adequate semantic features. Finally, we reorder candidate words according to their semantic relatedness to the seed word. Experimental results on a Chinese input method dataset show that our method gains better performance.

5 0.10725599 22 acl-2010-A Unified Graph Model for Sentence-Based Opinion Retrieval

Author: Binyang Li ; Lanjun Zhou ; Shi Feng ; Kam-Fai Wong

Abstract: There is a growing research interest in opinion retrieval as on-line users’ opinions are becoming more and more popular in business, social networks, etc. Practically speaking, the goal of opinion retrieval is to retrieve documents, which entail opinions or comments, relevant to a target subject specified by the user’s query. A fundamental challenge in opinion retrieval is information representation. Existing research focuses on document-based approaches and documents are represented by bag-of-word. However, due to loss of contextual information, this representation fails to capture the associative information between an opinion and its corresponding target. It cannot distinguish different degrees of a sentiment word when associated with different targets. This in turn seriously affects opinion retrieval performance. In this paper, we propose a sentence-based approach based on a new information representation, namely topic-sentiment word pair, to capture intra-sentence contextual information between an opinion and its target. Additionally, we consider inter-sentence information to capture the relationships among the opinions on the same topic. Finally, the two types of information are combined in a unified graph-based model, which can effectively rank the documents. Compared with existing approaches, experimental results on the COAE08 dataset showed that our graph-based model achieved significant improvement.

6 0.10292864 56 acl-2010-Bridging SMT and TM with Translation Recommendation

7 0.098503381 112 acl-2010-Extracting Social Networks from Literary Fiction

8 0.07895156 167 acl-2010-Learning to Adapt to Unknown Users: Referring Expression Generation in Spoken Dialogue Systems

9 0.078610688 171 acl-2010-Metadata-Aware Measures for Answer Summarization in Community Question Answering

10 0.077016704 200 acl-2010-Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing

11 0.074337624 79 acl-2010-Cross-Lingual Latent Topic Extraction

12 0.06789238 215 acl-2010-Speech-Driven Access to the Deep Web on Mobile Devices

13 0.066632383 157 acl-2010-Last but Definitely Not Least: On the Role of the Last Sentence in Automatic Polarity-Classification

14 0.064598396 187 acl-2010-Optimising Information Presentation for Spoken Dialogue Systems

15 0.064446807 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation

16 0.063413963 122 acl-2010-Generating Fine-Grained Reviews of Songs from Album Reviews

17 0.062834471 18 acl-2010-A Study of Information Retrieval Weighting Schemes for Sentiment Analysis

18 0.062075444 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

19 0.060196154 15 acl-2010-A Semi-Supervised Key Phrase Extraction Approach: Learning from Title Phrases through a Document Semantic Network

20 0.059471909 85 acl-2010-Detecting Experiences from Weblogs


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.158), (1, 0.091), (2, -0.119), (3, 0.01), (4, -0.008), (5, -0.07), (6, -0.037), (7, -0.001), (8, -0.021), (9, -0.036), (10, -0.006), (11, -0.013), (12, -0.01), (13, -0.061), (14, 0.044), (15, 0.07), (16, -0.053), (17, -0.054), (18, -0.041), (19, -0.016), (20, -0.047), (21, -0.16), (22, 0.111), (23, 0.044), (24, 0.008), (25, 0.005), (26, -0.029), (27, -0.03), (28, -0.002), (29, -0.001), (30, 0.015), (31, 0.077), (32, -0.007), (33, 0.154), (34, -0.118), (35, -0.104), (36, -0.081), (37, -0.088), (38, -0.082), (39, -0.088), (40, -0.048), (41, 0.064), (42, 0.073), (43, -0.031), (44, -0.004), (45, -0.136), (46, 0.018), (47, -0.141), (48, 0.048), (49, -0.037)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94468361 204 acl-2010-Recommendation in Internet Forums and Blogs


2 0.58078361 112 acl-2010-Extracting Social Networks from Literary Fiction

Author: David Elson ; Nicholas Dames ; Kathleen McKeown

Abstract: We present a method for extracting social networks from literature, namely, nineteenth-century British novels and serials. We derive the networks from dialogue interactions, and thus our method depends on the ability to determine when two characters are in conversation. Our approach involves character name chunking, quoted speech attribution and conversation detection given the set of quotes. We extract features from the social networks and examine their correlation with one another, as well as with metadata such as the novel’s setting. Our results provide evidence that the majority of novels in this time period do not fit two characterizations provided by literacy scholars. Instead, our results suggest an alternative explanation for differences in social networks.

3 0.52937442 129 acl-2010-Growing Related Words from Seed via User Behaviors: A Re-Ranking Based Approach

Author: Yabin Zheng ; Zhiyuan Liu ; Lixing Xie

Abstract: Motivated by Google Sets, we study the problem of growing related words from a single seed word by leveraging user behaviors hiding in user records of Chinese input method. Our proposed method is motivated by the observation that the more frequently two words co-occur in user records, the more related they are. First, we utilize user behaviors to generate candidate words. Then, we utilize search engine to enrich candidate words with adequate semantic features. Finally, we reorder candidate words according to their semantic relatedness to the seed word. Experimental results on a Chinese input method dataset show that our method gains better performance.

4 0.51221281 224 acl-2010-Talking NPCs in a Virtual Game World

Author: Tina Kluwer ; Peter Adolphs ; Feiyu Xu ; Hans Uszkoreit ; Xiwen Cheng

Abstract: This paper describes the KomParse system, a natural-language dialog system in the three-dimensional virtual world Twinity. In order to fulfill the various communication demands between non-player characters (NPCs) and users in such an online virtual world, the system realizes a flexible and hybrid approach combining knowledge-intensive domain-specific question answering, task-specific and domain-specific dialog with robust chatbot-like chitchat.

5 0.51174575 82 acl-2010-Demonstration of a Prototype for a Conversational Companion for Reminiscing about Images

Author: Yorick Wilks ; Roberta Catizone ; Alexiei Dingli ; Weiwei Cheng

Abstract: This paper describes an initial prototype demonstrator of a Companion, designed as a platform for novel approaches to the following: 1) The use of Information Extraction (IE) techniques to extract the content of incoming dialogue utterances after an Automatic Speech Recognition (ASR) phase, 2) The conversion of the input to Resource Descriptor Format (RDF) to allow the generation of new facts from existing ones, under the control of a Dialogue Manager (DM), that also has access to stored knowledge and to open knowledge accessed in real time from the web, all in RDF form, 3) A DM implemented as a stack and network virtual machine that models mixed initiative in dialogue control, and 4) A tuned dialogue act detector based on corpus evidence. The prototype platform was evaluated, and we describe this briefly; it is also designed to support more extensive forms of emotion detection carried by both speech and lexical content, as well as extended forms of machine learning.

6 0.50769657 254 acl-2010-Using Speech to Reply to SMS Messages While Driving: An In-Car Simulator User Study

7 0.50045288 176 acl-2010-Mood Patterns and Affective Lexicon Access in Weblogs

8 0.49594527 179 acl-2010-Now, Where Was I? Resumption Strategies for an In-Vehicle Dialogue System

9 0.48033518 174 acl-2010-Modeling Semantic Relevance for Question-Answer Pairs in Web Social Communities

10 0.45650476 215 acl-2010-Speech-Driven Access to the Deep Web on Mobile Devices

11 0.45009729 178 acl-2010-Non-Cooperation in Dialogue

12 0.44400746 15 acl-2010-A Semi-Supervised Key Phrase Extraction Approach: Learning from Title Phrases through a Document Semantic Network

13 0.42895189 117 acl-2010-Fine-Grained Genre Classification Using Structural Learning Algorithms

14 0.42206129 122 acl-2010-Generating Fine-Grained Reviews of Songs from Album Reviews

15 0.41902098 142 acl-2010-Importance-Driven Turn-Bidding for Spoken Dialogue Systems

16 0.41118723 171 acl-2010-Metadata-Aware Measures for Answer Summarization in Community Question Answering

17 0.40651122 63 acl-2010-Comparable Entity Mining from Comparative Questions

18 0.40053749 79 acl-2010-Cross-Lingual Latent Topic Extraction

19 0.38889036 34 acl-2010-Authorship Attribution Using Probabilistic Context-Free Grammars

20 0.38597396 157 acl-2010-Last but Definitely Not Least: On the Role of the Last Sentence in Automatic Polarity-Classification


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(9, 0.306), (14, 0.014), (25, 0.036), (39, 0.013), (42, 0.053), (59, 0.066), (72, 0.031), (73, 0.07), (78, 0.04), (80, 0.011), (83, 0.076), (84, 0.036), (97, 0.011), (98, 0.134)]
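
The topic weights above form a sparse vector of (topicId, topicWeight) pairs. The page does not say how simValue is derived from such vectors, so the following is only a minimal sketch under the assumption that two papers are compared by cosine similarity over their sparse topic vectors. The names `paper_204` and `sparse_cosine` are hypothetical and introduced here for illustration.

```python
import math

# Sparse LDA topic vector copied from the listing above:
# each pair is (topicId, topicWeight); topics not listed are treated as 0.
paper_204 = [
    (9, 0.306), (14, 0.014), (25, 0.036), (39, 0.013), (42, 0.053),
    (59, 0.066), (72, 0.031), (73, 0.07), (78, 0.04), (80, 0.011),
    (83, 0.076), (84, 0.036), (97, 0.011), (98, 0.134),
]


def sparse_cosine(a, b):
    """Cosine similarity between two sparse (topicId, weight) lists.

    Assumption: cosine similarity is used purely for illustration;
    the page does not state which measure produces simValue.
    """
    wa, wb = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * wb[t] for t, w in wa.items() if t in wb)
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under this assumption a paper compared with itself scores 1.0 (up to rounding), whereas the list reports 0.78856832 for the same-paper entry, so the measure actually used by the page evidently differs from plain cosine similarity.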

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.78856832 204 acl-2010-Recommendation in Internet Forums and Blogs

Author: Jia Wang ; Qing Li ; Yuanzhu Peter Chen ; Zhangxi Lin

Abstract: The variety of engaging interactions among users in social media distinguishes it from traditional Web media. Such a feature should be utilized while attempting to provide intelligent services to social media participants. In this article, we present a framework to recommend relevant information in Internet forums and blogs using user comments, one of the most representative user behaviors in online discussion. When incorporating user comments, we consider the structural, semantic, and authority information they carry. One of the most important observations from this work is that the semantic content of user comments can play a fairly different role in different forms of social media. When designing a recommendation system for this purpose, such a difference must be considered with caution.

2 0.72608972 117 acl-2010-Fine-Grained Genre Classification Using Structural Learning Algorithms

Author: Zhili Wu ; Katja Markert ; Serge Sharoff

Abstract: Prior use of machine learning in genre classification used a list of labels as classification categories. However, genre classes are often organised into hierarchies, e.g., covering the subgenres of fiction. In this paper we present a method of using the hierarchy of labels to improve the classification accuracy. As a testbed for this approach we use the Brown Corpus as well as a range of other corpora, including the BNC, HGC and Syracuse. The results are not encouraging: apart from the Brown corpus, the improvements of our structural classifier over the flat one are not statistically significant. We discuss the relation between structural learning performance and the visual and distributional balance of the label hierarchy, suggesting that only balanced hierarchies might profit from structural learning.

3 0.71937042 255 acl-2010-Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization

Author: Shay Cohen ; Noah A Smith

Abstract: We consider the search for a maximum likelihood assignment of hidden derivations and grammar weights for a probabilistic context-free grammar, the problem approximately solved by “Viterbi training.” We show that solving and even approximating Viterbi training for PCFGs is NP-hard. We motivate the use of uniform-at-random initialization for Viterbi EM as an optimal initializer in the absence of further information about the correct model parameters, providing an approximate bound on the log-likelihood.

4 0.53190267 251 acl-2010-Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews

Author: Niklas Jakob ; Iryna Gurevych

Abstract: unknown-abstract

5 0.52908343 22 acl-2010-A Unified Graph Model for Sentence-Based Opinion Retrieval

Author: Binyang Li ; Lanjun Zhou ; Shi Feng ; Kam-Fai Wong

Abstract: There is a growing research interest in opinion retrieval as on-line users’ opinions are becoming more and more popular in business, social networks, etc. Practically speaking, the goal of opinion retrieval is to retrieve documents, which entail opinions or comments, relevant to a target subject specified by the user’s query. A fundamental challenge in opinion retrieval is information representation. Existing research focuses on document-based approaches and documents are represented by bag-of-word. However, due to loss of contextual information, this representation fails to capture the associative information between an opinion and its corresponding target. It cannot distinguish different degrees of a sentiment word when associated with different targets. This in turn seriously affects opinion retrieval performance. In this paper, we propose a sentence-based approach based on a new information representation, namely topic-sentiment word pair, to capture intra-sentence contextual information between an opinion and its target. Additionally, we consider inter-sentence information to capture the relationships among the opinions on the same topic. Finally, the two types of information are combined in a unified graph-based model, which can effectively rank the documents. Compared with existing approaches, experimental results on the COAE08 dataset showed that our graph-based model achieved significant improvement.

6 0.52547514 209 acl-2010-Sentiment Learning on Product Reviews via Sentiment Ontology Tree

7 0.52380705 214 acl-2010-Sparsity in Dependency Grammar Induction

8 0.5209868 208 acl-2010-Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

9 0.51596189 174 acl-2010-Modeling Semantic Relevance for Question-Answer Pairs in Web Social Communities

10 0.51485562 140 acl-2010-Identifying Non-Explicit Citing Sentences for Citation-Based Summarization

11 0.51458758 188 acl-2010-Optimizing Informativeness and Readability for Sentiment Summarization

12 0.51328391 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition

13 0.51182961 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images

14 0.51137745 127 acl-2010-Global Learning of Focused Entailment Graphs

15 0.51098764 113 acl-2010-Extraction and Approximation of Numerical Attributes from the Web

16 0.51009291 245 acl-2010-Understanding the Semantic Structure of Noun Phrase Queries

17 0.50993407 39 acl-2010-Automatic Generation of Story Highlights

18 0.50882399 102 acl-2010-Error Detection for Statistical Machine Translation Using Linguistic Features

19 0.50751454 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts

20 0.5071665 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans