acl acl2013 acl2013-168 knowledge-graph by maker-knowledge-mining

168 acl-2013-Generating Recommendation Dialogs by Extracting Information from User Reviews


Source: pdf

Author: Kevin Reschke ; Adam Vogel ; Dan Jurafsky

Abstract: Recommendation dialog systems help users navigate e-commerce listings by asking questions about users’ preferences toward relevant domain attributes. We present a framework for generating and ranking fine-grained, highly relevant questions from user-generated reviews. We demonstrate our approach on a new dataset just released by Yelp, and release a new sentiment lexicon with 1329 adjectives for the restaurant domain.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 We present a framework for generating and ranking fine-grained, highly relevant questions from user-generated reviews. [sent-2, score-0.222]

2 We demonstrate our approach on a new dataset just released by Yelp, and release a new sentiment lexicon with 1329 adjectives for the restaurant domain. [sent-3, score-0.451]

3 1 Introduction Recommendation dialog systems have been developed for a number of tasks ranging from product search to restaurant recommendation (Chai et al. [sent-4, score-0.657]

4 These systems learn user requirements through spoken or text-based dialog, asking questions about particular attributes to filter the space of relevant documents. [sent-9, score-0.517]

5 Traditionally, these systems draw questions from a small, fixed set of attributes, such as cuisine or price in the restaurant domain. [sent-10, score-0.36]

6 (2012) show that information extracted from user reviews greatly improves user experience in visual search interfaces. [sent-13, score-0.304]

7 In this paper, we present a dialog-based interface that takes advantage of review texts. [sent-14, score-0.038]

8 We demonstrate our system on a new challenge corpus of 11,537 businesses and 229,907 user reviews released by the popular review website Yelp, focusing on the dataset’s 4724 restaurants and bars (164,106 reviews). [sent-15, score-0.566]

9 First, we present a framework for generating new, highly-relevant questions from user review texts. [sent-20, score-0.313]

10 The framework makes use of techniques from topic modeling and sentiment-based aspect extraction to identify fine-grained attributes for each business. [sent-21, score-0.356]

11 These attributes form the basis of a new set of questions that the system can ask the user. [sent-22, score-0.419]

12 Second, we use a method based on information gain for dynamically ranking candidate questions during dialog production. [sent-23, score-0.528]

13 This allows our system to select the most informative question at each dialog step. [sent-24, score-0.343]

14 An evaluation based on simulated dialogs shows that both the ranking method and the automatically generated questions improve recall. [sent-25, score-0.336]

15 1 Subcategory Questions Yelp provides each business with category labels for top-level cuisine types like Japanese, Coffee & Tea, and Vegetarian. [sent-27, score-0.202]

16 Many of these top-level categories have natural subcategories (e. [sent-28, score-0.14]

17 By identifying these subcategories, we enable questions which probe one step deeper than the top-level category label. [sent-32, score-0.226]

We run LDA (Blei et al., 2003) on the reviews of each set of businesses in the twenty most common top-level categories, using 10 topics and concatenating all of a business’s reviews into one document. [sent-34, score-0.612]

Several researchers have used sentence-level documents to model topics in reviews, but these tend to generate topics about fine-grained aspects of the sort we discuss in Section 2. [sent-35, score-0.188]

20 We then manually labeled the topics, discarding junk topics and merging similar topics. [sent-37, score-0.041]

Using these topic models, we assign a business to a subcategory. (We use the Topic Modeling Toolkit implementation.) [sent-39, score-0.13]

The chosen subcategory is the topic with highest probability in that business’s topic distribution. [sent-43, score-0.19]
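The assignment step above can be sketched in a few lines. This is an illustrative sketch, not the authors’ code: it assumes a trained topic model has already produced a per-business topic distribution, that the manual topic labels (junk topics mapped to None) are available as a dict, and that at least one topic is labeled.

```python
def assign_subcategory(topic_distribution, topic_labels):
    """Pick the subcategory whose topic has highest probability.

    topic_distribution: list of P(topic | business's concatenated reviews)
    topic_labels: topic index -> manually assigned subcategory label,
                  with discarded junk topics mapped to None.
    """
    best_topic, best_prob = None, -1.0
    for topic, prob in enumerate(topic_distribution):
        if topic_labels.get(topic) is None:  # skip discarded junk topics
            continue
        if prob > best_prob:
            best_topic, best_prob = topic, prob
    return topic_labels[best_topic]

# A hypothetical 4-topic model for the Japanese category:
labels = {0: "sushi", 1: "ramen", 2: None, 3: "teppanyaki"}
print(assign_subcategory([0.2, 0.5, 0.25, 0.05], labels))  # -> ramen
```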

23 Finally, we use these subcategory topics to generate questions for our recommender dialog system. [sent-44, score-0.695]

24 Each top-level category corresponds to a single question whose potential answers are the set of subcategories: e. [sent-45, score-0.07]

25 2 Questions from Fine-Grained Aspects Our second source for questions is based on aspect extraction in sentiment summarization (Blair-Goldensohn et al. [sent-49, score-0.505]

26 We define an aspect as any noun-phrase which is targeted by a sentiment predicate. [sent-51, score-0.318]

27 For example, from the sentence “The place had great atmosphere, but the service was slow. [sent-52, score-0.142]

28 Second, we apply syntactic patterns to identify NPs targeted by these sentiment predicates. [sent-56, score-0.247]

29 1 Sentiment Lexicon Coordination Graph We generate a list of domain-specific sentiment adjectives using graph propagation. [sent-59, score-0.32]

30 We begin with a seed set combining PARADIGM+ (Jo and Oh, 2011) with ‘strongly subjective’ adjectives from the OpinionFinder lexicon (Wilson et al. [sent-60, score-0.201]

31 Like Brody and Elhadad (2010), we then construct a coordination graph that links adjectives modifying the same noun, but to increase precision we require that the adjectives also be conjoined by and (Hatzivassiloglou and McKeown, 1997). [sent-62, score-0.218]

32 This reduces problems like propagating positive sentiment to orange in good orange chicken. [sent-63, score-0.31]

33 We marked adjectives that follow too or lie in the scope of negation with special prefixes and treated them as distinct lexical entries. [sent-64, score-0.079]

34 Sentiment Propagation Negative and positive seeds are assigned values of 0 and 1 respectively. [sent-65, score-0.028]

35 Then a standard propagation update is computed iteratively (see Eq. [sent-68, score-0.118]

36 In Brody and Elhadad’s implementation of this propagation method, seed sentiment values are fixed, and the update step is repeated until the nonseed values converge. [sent-70, score-0.393]

37 First, we omit candidate nodes that don’t link to at least two positive or two negative seeds. [sent-72, score-0.028]

38 Second, we run the propagation algorithm for fewer iterations (two iterations for negative terms and one for positive terms). [sent-74, score-0.118]

39 We found that additional iterations led to significant error propagation when neutral (italian) or ambiguous (thick) terms were assigned sentiment. [sent-75, score-0.09]
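The propagation step above can be sketched as follows. The exact update (Eq. 1) is not reproduced in this summary, so the sketch assumes a common form: a non-seed node’s score becomes the mean of its scored neighbors, while seed scores stay fixed at 0.0 (negative) or 1.0 (positive). Graph and seed contents are illustrative.

```python
from statistics import mean

def propagate(graph, seeds, iterations):
    """graph: adjective -> set of adjectives it was conjoined with.
    seeds: adjective -> fixed score (0.0 negative, 1.0 positive)."""
    scores = dict(seeds)
    for _ in range(iterations):
        new_scores = dict(scores)
        for node, neighbors in graph.items():
            if node in seeds:  # seed values never change
                continue
            known = [scores[n] for n in neighbors if n in scores]
            if known:
                new_scores[node] = mean(known)
        scores = new_scores
    return scores

graph = {
    "tasty": {"great", "fresh"},
    "great": {"tasty"},
    "fresh": {"tasty"},
}
seeds = {"great": 1.0, "fresh": 1.0}
print(propagate(graph, seeds, iterations=1)["tasty"])  # -> 1.0
```

Running only one or two iterations, as the text describes, keeps neutral or ambiguous terms from absorbing sentiment from distant seeds; the paper’s further filter (dropping candidates linked to fewer than two same-sign seeds) would be applied before this update.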

40 This allows us to learn, for example, that the negative seed decadent is positive in the restaurant domain. [sent-77, score-0.192]

41 Table 2 shows a sample of sentiment adjectives. (Our results are consistent with the recent finding of Whitney and Sarkar (2012) that cautious systems are better when bootstrapping from seeds.) [sent-78, score-0.291]

42 The final lexicon has 1329 adjectives, including 853 terms not in the original seed set. [sent-80, score-0.122]

43 Evaluative Verbs In addition to this adjective lexicon, we take 56 evaluative verbs such as love and hate from admire-class VerbNet predicates (Kipper-Schuler, 2005). [sent-82, score-0.099]

44 2 Extraction Patterns To identify noun-phrases which are targeted by predicates in our sentiment lexicon, we develop hand-crafted extraction patterns defined over syntactic dependency parses (Blair-Goldensohn et al. [sent-85, score-0.281]

45 Table 3 shows a sample of the aspects generated by these methods. [sent-87, score-0.072]

46 Adj + NP It is common practice to extract any NP modified by a sentiment adjective. [sent-88, score-0.212]

47 However, this simple extraction rule suffers from precision problems. [sent-89, score-0.034]

48 First, reviews often contain sentiment toward irrelevant, non-business targets (Wayne is the target of excellent job in (1)). [sent-90, score-0.4]

49 In (2), the extraction +service is clearly wrong–in fact, the opposite sentiment is being expressed. [sent-92, score-0.246]

50 (1) Wayne did an excellent job addressing our needs and giving us our options. [sent-93, score-0.058]

51 (2) Nice and airy atmosphere, but service could be more attentive at times. [sent-94, score-0.045]

52 We manually removed 26 spurious terms which were caused by parsing errors or propagation to a neutral term. [sent-95, score-0.141]

53 We address these problems by filtering out sentences in hypothetical contexts cued by if, should, could, or a question mark, and by adopting the following, more conservative extraction rules: i) [BIZ + have + adj. [sent-99, score-0.072]

54 + NP] Sentiment adjective modifies NP, main verb is have, subject is business name, it, they, place, or absent. [sent-100, score-0.129]

(e.g., This place has some really great yogurt and toppings). [sent-103, score-0.097]

56 “Good For” + NP Next, we extract aspects using the pattern BIZ + positive adj. + for + NP. [sent-108, score-0.1]

57 Examples of extracted aspects include +lunch, +large groups, +drinks, and +quick lunch. [sent-110, score-0.072]

58 Verb + NP Finally, we extract NPs that appear as direct object to one of our evaluative verbs (e. [sent-111, score-0.063]
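The hypothetical-context filter and rule (i) can be sketched as follows. The token representation (dicts with 'lemma', 'deprel', 'head') and all names here are assumptions of this sketch; the paper runs hand-crafted patterns over real dependency parses.

```python
import re

# Hypothetical contexts are cued by if, should, could, or a question mark.
HYPOTHETICAL = re.compile(r"\b(if|should|could)\b|\?", re.IGNORECASE)

def is_hypothetical(text):
    """True if the sentence should be filtered out before extraction."""
    return bool(HYPOTHETICAL.search(text))

def extract_have_aspects(tokens, sentiment_adjs, business_name):
    """Rule (i): [BIZ + have + adj. + NP] -- a sentiment adjective modifies
    an NP, the main verb is 'have', and the subject is the business name,
    'it', 'they', 'place', or absent. `tokens` is a hand-built dependency
    parse: dicts with 'lemma', 'deprel', and 'head' (a token index)."""
    allowed_subjects = {business_name, "it", "they", "place"}
    aspects = []
    for tok in tokens:
        if tok["deprel"] != "amod" or tok["lemma"] not in sentiment_adjs:
            continue
        noun = tokens[tok["head"]]
        if noun["deprel"] != "dobj":
            continue
        verb_idx = noun["head"]
        if tokens[verb_idx]["lemma"] != "have":
            continue
        subjects = [t["lemma"] for t in tokens
                    if t["head"] == verb_idx and t["deprel"] == "nsubj"]
        if not subjects or all(s in allowed_subjects for s in subjects):
            aspects.append("+" + noun["lemma"])  # adjective assumed positive
    return aspects

# "This place has great atmosphere" as a toy parse:
tokens = [
    {"lemma": "this",       "deprel": "det",   "head": 1},
    {"lemma": "place",      "deprel": "nsubj", "head": 2},
    {"lemma": "have",       "deprel": "root",  "head": 2},
    {"lemma": "great",      "deprel": "amod",  "head": 4},
    {"lemma": "atmosphere", "deprel": "dobj",  "head": 2},
]
print(extract_have_aspects(tokens, {"great"}, "biz"))  # -> ['+atmosphere']
```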

59 3 Aspects as Questions We generate questions from these extracted aspects using simple templates. [sent-116, score-0.26]

60 For example, the aspect +burritos yields the question: Do you want a place with good burritos? [sent-117, score-0.248]
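A minimal sketch of the template step, assuming aspects are signed strings like "+burritos"; the wording follows the example in the text, and the handling of negative aspects is an assumption of this sketch.

```python
def aspect_to_question(aspect):
    """Turn a signed aspect (e.g. '+burritos') into a yes/no question."""
    sign, noun = aspect[0], aspect[1:]
    quality = "good" if sign == "+" else "bad"  # negative case is assumed
    return f"Do you want a place with {quality} {noun}?"

print(aspect_to_question("+burritos"))
# -> Do you want a place with good burritos?
```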

61 3 Question Selection for Dialog To utilize the questions generated from reviews in recommendation dialogs, we first formalize the dialog optimization task and then offer a solution. [sent-118, score-0.563]

62 These attributes are a combination of Yelp categories and our automatically extracted aspects described in Section 2. [sent-122, score-0.101]

63 Attributes att ∈ Att take values in a finite domain dom(att). [sent-123, score-0.409]

64 We denote the subset of businesses with attribute att taking value val ∈ dom(att) as B|att=val. [sent-124, score-0.811]

65 Attributes are functions from businesses to subsets of values: att : B → P(dom(att)). [sent-125, score-0.409]

66 We model a user information need I as a set of attribute-value pairs. [sent-126, score-0.087]

67 The recommendation agent can use both the set of businesses B and the history of questions and answers H from the user to select the next query. [sent-134, score-0.903]

68 Thus, formally, a recommendation agent is a function π : B × H → Att. [sent-135, score-0.429]

69 The dialog ends after a fixed number of queries K. [sent-136, score-0.311]

70 2 Information Gain Agent The information gain recommendation agent chooses questions to ask the user by selecting question attributes that maximize the entropy of the resulting document set, in a manner similar to decision tree learning (Mitchell, 1997). [sent-139, score-1.024]
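A hedged sketch of this selection step: among unasked attributes, pick the one whose value distribution over the current business set has maximal entropy, in the spirit of decision-tree learning. The data layout (each business as a dict mapping attributes to sets of values) is an assumption of this sketch, not the paper’s representation.

```python
from collections import Counter
from math import log2

def value_entropy(businesses, att):
    """Entropy of attribute values over the current candidate businesses."""
    counts = Counter()
    for b in businesses:
        for val in b.get(att, ()):
            counts[val] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values())

def choose_attribute(businesses, attributes, asked):
    """Return the unasked attribute with maximal entropy over `businesses`."""
    candidates = [a for a in attributes if a not in asked]
    return max(candidates, key=lambda a: value_entropy(businesses, a))

# Toy candidate set: cuisine splits the businesses, patio does not.
biz = [
    {"cuisine": {"mexican"},  "patio": {"yes"}},
    {"cuisine": {"chinese"},  "patio": {"yes"}},
    {"cuisine": {"japanese"}, "patio": {"yes"}},
]
print(choose_attribute(biz, ["cuisine", "patio"], asked=set()))  # -> cuisine
```

After each answer, the agent would filter the business set to B|att=val and repeat with the asked attribute added to `asked`.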

71 1 Experimental Setup We follow the standard approach of using the attributes of an individual business as a simulation of a user’s preferences (Chung, 2004; Young et al., 2010). [sent-141, score-0.31]

72 For every business b ∈ B we form an information need composed of all of b’s attributes: Ib = {(att, att(b)) : att ∈ Att, att(b) ≠ ∅}. [sent-143, score-0.093]

73 To evaluate a recommendation agent, we use the recall metric, which measures how well an information need is satisfied. [sent-144, score-0.276]

74 For each information need I, let BI be the set of businesses that satisfy the questions of an agent. [sent-145, score-0.499]

75 We define the recall of the set of businesses with respect to the information need as recall(B_I, I) = (1 / (|B_I| · |I|)) Σ_{b ∈ B_I} Σ_{(att, val) ∈ I} 1[val ∈ att(b)]. We average recall across all information needs, yielding average recall. [sent-146, score-0.373]
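A sketch of this recall computation, assuming the same attribute-to-set-of-values layout for businesses and the information need as a list of (att, val) pairs; for each business in the returned set it counts the fraction of need pairs satisfied, then averages over the set.

```python
def recall(returned, need):
    """returned: list of businesses (dict att -> set of values);
    need: list of (att, val) pairs from the information need."""
    if not returned or not need:
        return 0.0
    hits = sum(1 for b in returned for att, val in need
               if val in b.get(att, set()))
    return hits / (len(returned) * len(need))

need = [("cuisine", "mexican"), ("patio", "yes")]
returned = [
    {"cuisine": {"mexican"}, "patio": {"yes"}},  # satisfies both pairs
    {"cuisine": {"mexican"}, "patio": set()},    # satisfies one pair
]
print(recall(returned, need))  # -> 0.75
```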

76 We compare against a random agent baseline that selects attributes att ∈ Att uniformly at random at each time step. [sent-147, score-0.815]

77 (2010) select questions from a small fixed hierarchy, which is not applicable to our large set of attributes. [sent-149, score-0.188]

78 2 Results Figure 1 shows the average recall for the random agent versus the information gain agent with varying sets of attributes. [sent-151, score-0.456]

79 ‘Top-level’ repeatedly queries the user’s top-level category preferences, ‘Subtopic’ additionally uses our topic modeling subcategories, and ‘All’ uses these plus the aspects extracted from reviews. [sent-152, score-0.147]

80 The ‘Subtopic’ and ‘Top-level’ systems plateau after a few dialog steps once they’ve asked all their useful questions. [Figure 1: Average recall for each agent, by dialog length.] [sent-154, score-0.342]

81 For instance, most businesses only have one or two top-level categories, so after the system has identified the top-level category that the user is interested in, it has no more good questions to ask. [sent-156, score-0.624]

82 Note that the information gain agent starts dialogs with the top-level and appropriate subcategory questions, so it is only for longer dialogs that the fine-grained aspects boost performance. [sent-157, score-0.725]

83 Below we show a few sample output dialogs from our ‘All’ information gain agent. [sent-158, score-0.205]

84 A: American (New) Q: What kind of American (New) do you want: bar, bistro, standard, burgers, brew pub, or brunch? [sent-160, score-0.042]

85 A: bistro Q: Do you want a place with a good patio? [sent-161, score-0.235]

86 A: Chinese Q: What kind of Chinese place do you want: buffet, dim sum, noodles, pan Asian, Panda Express, sit down, or veggie? [sent-163, score-0.174]

87 A: sit down Q: Do you want a place with a good lunch special? [sent-164, score-0.256]

88 A: Mexican Q: What kind of Mexican place do you want: dinner, taqueria, margarita bar, or tortas? [sent-166, score-0.19]

89 A: Margarita bar Q: Do you want a place with a good patio? [sent-167, score-0.235]

90 A: Yes 5 Conclusion We presented a system for extracting large sets of attributes from user reviews and selecting relevant attributes to ask questions about. [sent-168, score-0.85]

91 Using topic models to discover subtypes of businesses, a domain-specific sentiment lexicon, and a number of new techniques for increasing precision in sentiment aspect extraction yields attributes that give a rich representation of the restaurant domain. [sent-169, score-0.847]

92 We have made this 1329-term sentiment lexicon for the restaurant domain available as a useful resource to the community. [sent-170, score-0.372]

93 Our information gain recommendation agent gives a principled way to dynamically combine these diverse attributes to ask relevant questions in a coherent dialog. [sent-171, score-0.968]

94 Our approach thus offers a new way to integrate the advantages of the curated hand-built attributes used in statistical slot-and-filler dialog systems, and the distributionally induced, highly relevant categories built by sentiment aspect extraction systems. [sent-172, score-0.871]

95 Natural language assistant - a dialog system for online product recommendation. [sent-195, score-0.311]

96 Revminer: An extractive interface for navigating reviews on a smartphone. [sent-207, score-0.13]

97 Aspect and sentiment unification model for online review analysis. [sent-211, score-0.25]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('att', 0.409), ('dialog', 0.311), ('businesses', 0.311), ('recommendation', 0.245), ('sentiment', 0.212), ('questions', 0.188), ('agent', 0.184), ('attributes', 0.18), ('dialogs', 0.148), ('reviews', 0.13), ('subcategory', 0.116), ('subcategories', 0.111), ('restaurant', 0.101), ('place', 0.097), ('brody', 0.097), ('business', 0.093), ('propagation', 0.09), ('user', 0.087), ('infogain', 0.087), ('yelp', 0.087), ('want', 0.08), ('adjectives', 0.079), ('np', 0.078), ('atmosphere', 0.077), ('elhadad', 0.076), ('aspects', 0.072), ('aspect', 0.071), ('cuisine', 0.071), ('dom', 0.068), ('evaluative', 0.063), ('seed', 0.063), ('lexicon', 0.059), ('bar', 0.058), ('val', 0.058), ('bistro', 0.058), ('biz', 0.058), ('burritos', 0.058), ('listings', 0.058), ('patio', 0.058), ('tbt', 0.058), ('vals', 0.058), ('gain', 0.057), ('yes', 0.055), ('somasundaran', 0.052), ('ask', 0.051), ('opinionfinder', 0.051), ('kope', 0.051), ('margarita', 0.051), ('young', 0.051), ('spurious', 0.051), ('jo', 0.048), ('mexican', 0.047), ('service', 0.045), ('lunch', 0.044), ('history', 0.044), ('kind', 0.042), ('selects', 0.042), ('swapna', 0.042), ('topics', 0.041), ('hypothetical', 0.04), ('wayne', 0.04), ('subtopic', 0.04), ('recommender', 0.039), ('chai', 0.039), ('mehmet', 0.039), ('category', 0.038), ('verbnet', 0.038), ('review', 0.038), ('preferences', 0.037), ('topic', 0.037), ('adjective', 0.036), ('targeted', 0.035), ('sit', 0.035), ('orange', 0.035), ('extraction', 0.034), ('thompson', 0.034), ('nps', 0.034), ('finegrained', 0.034), ('relevant', 0.034), ('hatzivassiloglou', 0.033), ('attribute', 0.033), ('question', 0.032), ('janyce', 0.032), ('oh', 0.031), ('coordination', 0.031), ('recall', 0.031), ('excellent', 0.03), ('dynamically', 0.029), ('categories', 0.029), ('graph', 0.029), ('update', 0.028), ('job', 0.028), ('positive', 0.028), ('asking', 0.028), ('wilson', 0.028), ('ba', 0.027), ('jeff', 0.027), ('bridge', 0.027), ('stanford', 0.027), ('wiebe', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999994 168 acl-2013-Generating Recommendation Dialogs by Extracting Information from User Reviews

Author: Kevin Reschke ; Adam Vogel ; Dan Jurafsky

Abstract: Recommendation dialog systems help users navigate e-commerce listings by asking questions about users’ preferences toward relevant domain attributes. We present a framework for generating and ranking fine-grained, highly relevant questions from user-generated reviews. We demonstrate our approach on a new dataset just released by Yelp, and release a new sentiment lexicon with 1329 adjectives for the restaurant domain.

2 0.29800847 230 acl-2013-Lightly Supervised Learning of Procedural Dialog Systems

Author: Svitlana Volkova ; Pallavi Choudhury ; Chris Quirk ; Bill Dolan ; Luke Zettlemoyer

Abstract: Procedural dialog systems can help users achieve a wide range of goals. However, such systems are challenging to build, currently requiring manual engineering of substantial domain-specific task knowledge and dialog management strategies. In this paper, we demonstrate that it is possible to learn procedural dialog systems given only light supervision, of the type that can be provided by non-experts. We consider domains where the required task knowledge exists in textual form (e.g., instructional web pages) and where system builders have access to statements of user intent (e.g., search query logs or dialog interactions). To learn from such textual resources, we describe a novel approach that first automatically extracts task knowledge from instructions, then learns a dialog manager over this task knowledge to provide assistance. Evaluation in a Microsoft Office domain shows that the individual components are highly accurate and can be integrated into a dialog system that provides effective help to users.

3 0.22426118 124 acl-2013-Discriminative state tracking for spoken dialog systems

Author: Angeliki Metallinou ; Dan Bohus ; Jason Williams

Abstract: In spoken dialog systems, statistical state tracking aims to improve robustness to speech recognition errors by tracking a posterior distribution over hidden dialog states. Current approaches based on generative or discriminative models have different but important shortcomings that limit their accuracy. In this paper we discuss these limitations and introduce a new approach for discriminative state tracking that overcomes them by leveraging the problem structure. An offline evaluation with dialog data collected from real users shows improvements in both state tracking accuracy and the quality of the posterior probabilities. Features that encode speech recognition error patterns are particularly helpful, and training requires rel- atively few dialogs.

4 0.16936848 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations

Author: Angeliki Lazaridou ; Ivan Titov ; Caroline Sporleder

Abstract: We propose a joint model for unsupervised induction of sentiment, aspect and discourse information and show that by incorporating a notion of latent discourse relations in the model, we improve the prediction accuracy for aspect and sentiment polarity on the sub-sentential level. We deviate from the traditional view of discourse, as we induce types of discourse relations and associated discourse cues relevant to the considered opinion analysis task; consequently, the induced discourse relations play the role of opinion and aspect shifters. The quantitative analysis that we conducted indicated that the integration of a discourse model increased the prediction accuracy results with respect to the discourse-agnostic approach and the qualitative analysis suggests that the induced representations encode a meaningful discourse structure.

5 0.15657182 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words

Author: Hongliang Yu ; Zhi-Hong Deng ; Shiyingxue Li

Abstract: Sentiment Word Identification (SWI) is a basic technique in many sentiment analysis applications. Most existing researches exploit seed words, and lead to low robustness. In this paper, we propose a novel optimization-based model for SWI. Unlike previous approaches, our model exploits the sentiment labels of documents instead of seed words. Several experiments on real datasets show that WEED is effective and outperforms the state-of-the-art methods with seed words.

6 0.15381743 211 acl-2013-LABR: A Large Scale Arabic Book Reviews Dataset

7 0.14302967 318 acl-2013-Sentiment Relevance

8 0.12955965 148 acl-2013-Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams

9 0.1251125 373 acl-2013-Using Conceptual Class Attributes to Characterize Social Media Users

10 0.11636373 169 acl-2013-Generating Synthetic Comparable Questions for News Articles

11 0.1072759 121 acl-2013-Discovering User Interactions in Ideological Discussions

12 0.10321874 249 acl-2013-Models of Semantic Representation with Visual Attributes

13 0.10307111 379 acl-2013-Utterance-Level Multimodal Sentiment Analysis

14 0.10109579 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering

15 0.099916793 147 acl-2013-Exploiting Topic based Twitter Sentiment for Stock Prediction

16 0.095411025 79 acl-2013-Character-to-Character Sentiment Analysis in Shakespeare's Plays

17 0.091367334 115 acl-2013-Detecting Event-Related Links and Sentiments from Social Media Texts

18 0.089248158 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis

19 0.088392958 90 acl-2013-Conditional Random Fields for Responsive Surface Realisation using Global Features

20 0.088088185 81 acl-2013-Co-Regression for Cross-Language Review Rating Prediction


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.174), (1, 0.199), (2, -0.028), (3, 0.09), (4, -0.046), (5, -0.05), (6, 0.097), (7, -0.118), (8, 0.049), (9, 0.091), (10, 0.068), (11, -0.01), (12, 0.007), (13, 0.017), (14, 0.066), (15, -0.02), (16, -0.01), (17, 0.037), (18, 0.044), (19, 0.027), (20, -0.134), (21, -0.037), (22, 0.082), (23, 0.055), (24, 0.231), (25, -0.155), (26, 0.038), (27, 0.244), (28, -0.036), (29, -0.122), (30, -0.077), (31, 0.021), (32, 0.131), (33, 0.041), (34, -0.084), (35, 0.018), (36, -0.033), (37, -0.022), (38, 0.098), (39, -0.101), (40, -0.009), (41, -0.007), (42, 0.006), (43, -0.057), (44, -0.051), (45, -0.055), (46, -0.026), (47, 0.029), (48, -0.02), (49, 0.048)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9478851 168 acl-2013-Generating Recommendation Dialogs by Extracting Information from User Reviews

Author: Kevin Reschke ; Adam Vogel ; Dan Jurafsky

Abstract: Recommendation dialog systems help users navigate e-commerce listings by asking questions about users’ preferences toward relevant domain attributes. We present a framework for generating and ranking fine-grained, highly relevant questions from user-generated reviews. We demonstrate our approach on a new dataset just released by Yelp, and release a new sentiment lexicon with 1329 adjectives for the restaurant domain.

2 0.82660878 230 acl-2013-Lightly Supervised Learning of Procedural Dialog Systems

Author: Svitlana Volkova ; Pallavi Choudhury ; Chris Quirk ; Bill Dolan ; Luke Zettlemoyer

Abstract: Procedural dialog systems can help users achieve a wide range of goals. However, such systems are challenging to build, currently requiring manual engineering of substantial domain-specific task knowledge and dialog management strategies. In this paper, we demonstrate that it is possible to learn procedural dialog systems given only light supervision, of the type that can be provided by non-experts. We consider domains where the required task knowledge exists in textual form (e.g., instructional web pages) and where system builders have access to statements of user intent (e.g., search query logs or dialog interactions). To learn from such textual resources, we describe a novel approach that first automatically extracts task knowledge from instructions, then learns a dialog manager over this task knowledge to provide assistance. Evaluation in a Microsoft Office domain shows that the individual components are highly accurate and can be integrated into a dialog system that provides effective help to users.

3 0.78556383 124 acl-2013-Discriminative state tracking for spoken dialog systems

Author: Angeliki Metallinou ; Dan Bohus ; Jason Williams

Abstract: In spoken dialog systems, statistical state tracking aims to improve robustness to speech recognition errors by tracking a posterior distribution over hidden dialog states. Current approaches based on generative or discriminative models have different but important shortcomings that limit their accuracy. In this paper we discuss these limitations and introduce a new approach for discriminative state tracking that overcomes them by leveraging the problem structure. An offline evaluation with dialog data collected from real users shows improvements in both state tracking accuracy and the quality of the posterior probabilities. Features that encode speech recognition error patterns are particularly helpful, and training requires rel- atively few dialogs.

4 0.49748924 141 acl-2013-Evaluating a City Exploration Dialogue System with Integrated Question-Answering and Pedestrian Navigation

Author: Srinivasan Janarthanam ; Oliver Lemon ; Phil Bartie ; Tiphaine Dalmas ; Anna Dickinson ; Xingkun Liu ; William Mackaness ; Bonnie Webber

Abstract: We present a city navigation and tourist information mobile dialogue app with integrated question-answering (QA) and geographic information system (GIS) modules that helps pedestrian users to navigate in and learn about urban environments. In contrast to existing mobile apps which treat these problems independently, our Android app addresses the problem of navigation and touristic questionanswering in an integrated fashion using a shared dialogue context. We evaluated our system in comparison with Samsung S-Voice (which interfaces to Google navigation and Google search) with 17 users and found that users judged our system to be significantly more interesting to interact with and learn from. They also rated our system above Google search (with the Samsung S-Voice interface) for tourist information tasks.

5 0.48234278 211 acl-2013-LABR: A Large Scale Arabic Book Reviews Dataset

Author: Mohamed Aly ; Amir Atiya

Abstract: We introduce LABR, the largest sentiment analysis dataset to-date for the Arabic language. It consists of over 63,000 book reviews, each rated on a scale of 1 to 5 stars. We investigate the properties of the the dataset, and present its statistics. We explore using the dataset for two tasks: sentiment polarity classification and rating classification. We provide standard splits of the dataset into training and testing, for both polarity and rating classification, in both balanced and unbalanced settings. We run baseline experiments on the dataset to establish a benchmark.

6 0.47100085 188 acl-2013-Identifying Sentiment Words Using an Optimization-based Model without Seed Words

7 0.46810603 318 acl-2013-Sentiment Relevance

8 0.44575569 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting

9 0.4315168 91 acl-2013-Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning

10 0.43090293 148 acl-2013-Exploring Sentiment in Social Media: Bootstrapping Subjectivity Clues from Multilingual Twitter Streams

11 0.42059377 373 acl-2013-Using Conceptual Class Attributes to Characterize Social Media Users

12 0.41265485 79 acl-2013-Character-to-Character Sentiment Analysis in Shakespeare's Plays

13 0.40308258 266 acl-2013-PAL: A Chatterbot System for Answering Domain-specific Questions

14 0.3988643 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification

15 0.38013238 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations

16 0.37412038 81 acl-2013-Co-Regression for Cross-Language Review Rating Prediction

17 0.3705886 90 acl-2013-Conditional Random Fields for Responsive Surface Realisation using Global Features

18 0.36856025 379 acl-2013-Utterance-Level Multimodal Sentiment Analysis

19 0.35894272 176 acl-2013-Grounded Unsupervised Semantic Parsing

20 0.32791349 284 acl-2013-Probabilistic Sense Sentiment Similarity through Hidden Emotions


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.047), (6, 0.034), (11, 0.064), (14, 0.01), (15, 0.018), (24, 0.038), (26, 0.055), (28, 0.015), (35, 0.082), (41, 0.267), (42, 0.048), (48, 0.03), (70, 0.084), (88, 0.042), (90, 0.024), (95, 0.051)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.79836637 168 acl-2013-Generating Recommendation Dialogs by Extracting Information from User Reviews

Author: Kevin Reschke ; Adam Vogel ; Dan Jurafsky

Abstract: Recommendation dialog systems help users navigate e-commerce listings by asking questions about users’ preferences toward relevant domain attributes. We present a framework for generating and ranking fine-grained, highly relevant questions from user-generated reviews. We demonstrate our approach on a new dataset just released by Yelp, and release a new sentiment lexicon with 1329 adjectives for the restaurant domain.

2 0.67386937 99 acl-2013-Crowd Prefers the Middle Path: A New IAA Metric for Crowdsourcing Reveals Turker Biases in Query Segmentation

Author: Rohan Ramanath ; Monojit Choudhury ; Kalika Bali ; Rishiraj Saha Roy

Abstract: Query segmentation, like text chunking, is the first step towards query understanding. In this study, we explore the effectiveness of crowdsourcing for this task. Through carefully designed control experiments and Inter Annotator Agreement metrics for analysis of experimental data, we show that crowdsourcing may not be a suitable approach for query segmentation because the crowd seems to have a very strong bias towards dividing the query into roughly equal (often only two) parts. Similarly, in the case of hierarchical or nested segmentation, turkers have a strong preference towards balanced binary trees.

3 0.66931409 249 acl-2013-Models of Semantic Representation with Visual Attributes

Author: Carina Silberer ; Vittorio Ferrari ; Mirella Lapata

Abstract: We consider the problem of grounding the meaning of words in the physical world and focus on the visual modality which we represent by visual attributes. We create a new large-scale taxonomy of visual attributes covering more than 500 concepts and their corresponding 688K images. We use this dataset to train attribute classifiers and integrate their predictions with text-based distributional models of word meaning. We show that these bimodal models give a better fit to human word association data compared to amodal models and word representations based on handcrafted norming data.

4 0.54049444 169 acl-2013-Generating Synthetic Comparable Questions for News Articles

Author: Oleg Rokhlenko ; Idan Szpektor

Abstract: We introduce the novel task of automatically generating questions that are relevant to a text but do not appear in it. One motivating example of its application is for increasing user engagement around news articles by suggesting relevant comparable questions, such as “is Beyonce a better singer than Madonna?”, for the user to answer. We present the first algorithm for the task, which consists of: (a) offline construction of a comparable question template database; (b) ranking of relevant templates to a given article; and (c) instantiation of templates only with entities in the article whose comparison under the template’s relation makes sense. We tested the suggestions generated by our algorithm via a Mechanical Turk experiment, which showed a significant improvement over the strongest baseline of more than 45% in all metrics.

5 0.53707254 341 acl-2013-Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm

Author: Yukari Ogura ; Ichiro Kobayashi

Abstract: In this paper, we propose a method to raise the accuracy of text classification based on latent topics, reconsidering the techniques necessary for good classification for example, to decide important sentences in a document, the sentences with important words are usually regarded as important sentences. In this case, tf.idf is often used to decide important words. On the other hand, we apply the PageRank algorithm to rank important words in each document. Furthermore, before clustering documents, we refine the target documents by representing them as a collection of important sentences in each document. We then classify the documents based on latent information in the documents. As a clustering method, we employ the k-means algorithm and inves– tigate how our proposed method works for good clustering. We conduct experiments with Reuters-21578 corpus under various conditions of important sentence extraction, using latent and surface information for clustering, and have confirmed that our proposed method provides better result among various conditions for clustering.

6 0.53327048 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing

7 0.53254968 318 acl-2013-Sentiment Relevance

8 0.52912301 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering

9 0.52815568 224 acl-2013-Learning to Extract International Relations from Political Context

10 0.52746534 167 acl-2013-Generalizing Image Captions for Image-Text Parallel Corpus

11 0.52700496 329 acl-2013-Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization

12 0.52686375 275 acl-2013-Parsing with Compositional Vector Grammars

13 0.52670622 212 acl-2013-Language-Independent Discriminative Parsing of Temporal Expressions

14 0.52660549 153 acl-2013-Extracting Events with Informal Temporal References in Personal Histories in Online Communities

15 0.52630836 80 acl-2013-Chinese Parsing Exploiting Characters

16 0.52552855 274 acl-2013-Parsing Graphs with Hyperedge Replacement Grammars

17 0.52523631 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction

18 0.52498519 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing

19 0.5248034 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search

20 0.52478278 144 acl-2013-Explicit and Implicit Syntactic Features for Text Classification