
173 emnlp-2013-Simulating Early-Termination Search for Verbose Spoken Queries


Source: pdf

Author: Jerome White ; Douglas W. Oard ; Nitendra Rajput ; Marion Zalk

Abstract: Building search engines that can respond to spoken queries with spoken content requires that the system not just be able to find useful responses, but also that it know when it has heard enough about what the user wants to be able to do so. This paper describes a simulation study with queries spoken by non-native speakers that suggests that finding relevant content is often possible within half a minute, and that combining features based on automatically recognized words with features designed for automated prediction of query difficulty can serve as a useful basis for predicting when that useful content has been found.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Building search engines that can respond to spoken queries with spoken content requires that the system not just be able to find useful responses, but also that it know when it has heard enough about what the user wants to be able to do so. [sent-6, score-0.546]

2 Much of the early work on what has come to be called “speech retrieval” has focused on the use of text queries to rank segments that are automatically extracted from spoken content. [sent-8, score-0.362]

3 This raises two challenges: 1) in such settings, both the query and the content must be spoken, and 2) the language being spoken will often be one for which we lack accurate speech recognition. [sent-10, score-0.379]

4 …spoken queries for unrestricted spoken content pose new challenges that call for new thinking about interaction design. [sent-19, score-0.422]

5 This paper explores the potential of a recently proposed alternative, in which the spoken queries are long, and only one response can be played at a time by the system. [sent-20, score-0.33]

6 As with Web searchers, we can expect them to explore initially, then to ultimately settle on query strategies that work well enough to meet their needs. [sent-27, score-0.178]

7 …study for this paper in which we asked people to babble on some topic for which we already have relevance judgments. [sent-32, score-0.614]

8 We transcribe those babbles using automatic speech recognition (ASR), then note how many words must be babbled in each case before an information retrieval system is first able to place a relevant document in rank one. [sent-33, score-0.285]

9 From this perspective, our results show that people are indeed often able to babble usefully; and, moreover, that current information retrieval technology could often place relevant results at rank one within half a minute or so of babbling even with contemporary speech recognition technology. [sent-34, score-0.8]

10 Barging in with an answer before that point wastes time and disrupts the user; barging in long after that point not only wastes time, but also risks user abandonment. [sent-36, score-0.194]

11 Section 6 completes the description of our methods with an explanation of how the stopping classifier is built; Section 7 then presents end-to-end evaluation results using a new measure designed for this task. [sent-42, score-0.287]

12 It is relatively straightforward to collect and store spoken content regardless of the language in which it is spoken; organizing and searching that content is, however, anything but straightforward. [sent-49, score-0.24]

13 Indeed, the current lack of effective search services is one of the key inhibitors that has, to date, limited spoken forums to experimental settings with at most a few hundred users. [sent-50, score-0.192]

14 An alternative would be to adopt more of an “information retrieval” perspective by directly matching words spoken in the query with words that had been spoken in the content to be searched. [sent-53, score-0.499]

15 Some progress has been made on this task in the MediaEval benchmark evaluation, which has included a spoken content matching task each year since 2011 (Metze et al. [sent-54, score-0.228]

16 Our goal in this paper is to begin to explore how such capabilities might be employed in a complete search engine for spoken forum content, as will be evaluated for the first time at MediaEval 2013. [sent-59, score-0.192]

17 The principal impediment to development in this first year of that evaluation is the need for relevance judgments, which are not currently available for spoken content of the type we wish to search. [sent-60, score-0.274]

18 Figure 1: Reciprocal ranks for each query making up a given babble. [sent-66, score-0.178]

19 When retrieving results, a babbler either “latches” on to a relevant document (Babble 1), moves back-and-forth between relevant documents (Babble 3), or fails to elicit a relevant document at all (Babble 2). [sent-67, score-0.389]

20 Those recognized words, in turn, have been used to rank order the (character-coded written text) news documents that were originally used in TREC, the documents for which we have relevance judgments. [sent-71, score-0.22]

21 Our goal then becomes twofold: to first rank the documents in such a way as to get a relevant document into rank one; and then to recognize when we have done so. [sent-72, score-0.311]

22 For three different babbles prompted by TREC Topic 274, it shows the reciprocal rank for the query that is posed after each additional word is recognized. [sent-74, score-0.879]

23 A reciprocal rank of one indicates that a known relevant document is in position one; a reciprocal rank of 0.5 that the most highly ranked known relevant document is in position two. [sent-76, score-0.5]

24 Ten TREC-5 Ad Hoc topics were selected for this study: 255, 257, 258, 260, 266, 271, 274, 276, 287, and 297, based on our expectation of which of the 50 TREC-5 topics would be most suitable for prompted babbles. [sent-79, score-0.179]

25 For each topic, three babbles were created by people speaking at length about the same information need that the TREC topic reflected. [sent-81, score-0.597]

26 For convenience, the people who created the babbles were second-language speakers of English selected from information technology companies. [sent-82, score-0.521]

27 There were a total of ten babblers; each recorded, in English, babbles for three topics, yielding a total of thirty babbles. [sent-83, score-0.553]

28 We maintained a balance across topics when assigning them to babblers. [sent-84, score-0.292]

29 The system prompted the user for a three-digit topic ID. [sent-93, score-0.167]

30 After obtaining the topic ID, the system then prompted the user to start speaking about what they were looking for. [sent-94, score-0.2]

31 Table 2: Average ASR Word Error Rate over 3 babbles per topic (SD = standard deviation). [sent-124, score-0.564]

32 The ASR transcripts of the babbles were used by our system as a basis for ranking, and as a basis for making the decision on when to barge in, what we call the “stopping point.” [sent-126, score-0.591]

33 Each babble was turned into a set of nested queries by sequentially concatenating words. [sent-148, score-0.556]

34 Specifically, the first query contained only the first word from the babble, the second query only the first two words, and so on. [sent-149, score-0.272]

35 Thus, the number of queries presented to Indri for a given babble was equivalent to the number of words in the babble, with each query differing only by the number of words it contained. [sent-150, score-0.692]
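
A minimal Python sketch (illustrative; not from the paper) of the nested-query construction just described, where each successive query adds one more recognized word:

    def nested_queries(babble_words):
        # Query i consists of the first i recognized words of the babble,
        # so a babble of N words yields N successively longer queries.
        return [" ".join(babble_words[:i])
                for i in range(1, len(babble_words) + 1)]

    # nested_queries(["find", "cheap", "electric", "cars"])
    # -> ["find", "find cheap", "find cheap electric", "find cheap electric cars"]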

36 For evaluation, we were interested in the reciprocal rank; in particular, where the reciprocal rank was one. [sent-153, score-0.231]

37 This measure tells us when Indri was able to place a known relevant document at rank one. [sent-154, score-0.227]
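
The bookkeeping behind this can be sketched in a few lines of Python (helper names are hypothetical; relevance comes from the TREC judgments):

    def reciprocal_rank(ranked_doc_ids, relevant_ids):
        # 1 / rank of the first known relevant document; 0.0 if none is retrieved.
        for rank, doc_id in enumerate(ranked_doc_ids, start=1):
            if doc_id in relevant_ids:
                return 1.0 / rank
        return 0.0

    def is_success(ranked_doc_ids, relevant_ids):
        # The success criterion used here: a known relevant document at rank one.
        return reciprocal_rank(ranked_doc_ids, relevant_ids) == 1.0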

38 While this may seem low, it is in line with observations from other spoken content retrieval research: over classroom lectures (Chelba et al. [sent-159, score-0.244]

39 It includes only the subset of babbles for which, during the babble, at least one known relevant document was found at the top of the ranked list. [sent-166, score-0.673]

40 The table presents the number of recognized words—a proxy for the number of potential stopping points—and at how many of those potential stopping points the document ranked in position 1 is known to be relevant, known not to be relevant, or of unknown relevance. [sent-167, score-0.693]

41 Because of the way in which TREC relevance judgments were created, unknown relevance indicates that no TREC system returned the document near the top of their ranked list. [sent-168, score-0.24]

42 Table 3 also shows how much we would need to rely on that assumption: the “scorable” fraction for which the relevance of the top-ranked document is known, rather than assumed, ranges from 93 per cent down to 5 per cent. [sent-170, score-0.209]

43 In the averages that we report below, we omit the five babbles with scorable fractions of 30 per cent or less. [sent-171, score-0.777]
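
A small sketch of how the “scorable” fraction might be computed, assuming each potential stopping point carries a judgment for its top-ranked document: True (known relevant), False (known non-relevant), or None (unjudged):

    def scorable_fraction(top_doc_judgments):
        # Fraction of potential stopping points whose top-ranked document
        # has a known relevance judgment rather than being unjudged (None).
        judged = [j for j in top_doc_judgments if j is not None]
        return len(judged) / len(top_doc_judgments)

    # Babbles with scorable_fraction(...) <= 0.30 are omitted from the averages.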

44 On average, over the 10 topics for which more than 30 per cent of the potential stopping points are scorable, there are 37 stopping points at which our system could have been scored as successful based on a known relevant document in position 1. [sent-172, score-0.89]

45 In three of these cases, the challenge for our stopping classifier is extreme, with only a handful—between two and seven—of such opportunities. [sent-173, score-0.287]

46 Table 3 next presents the word positions at which known relevant documents first and last appear in rank one (“First Rel”). [sent-176, score-0.209]

47 These are the earliest and latest scorable successful stopping points. [sent-177, score-0.394]

48 As can be seen, the first possible stopping point exhibits considerable variation, as does the last. [sent-178, score-0.26]

49 For some babbles—babble 274-3, for example—almost any choice of stopping points would be fine. [sent-179, score-0.265]

50 In other cases—babble 258-1, for example—a stopping point prediction would need to be spot on to get any useful results at all. [sent-180, score-0.291]

51 Moreover, we can see both cases in different babbles for the same topic, even though both babblers were prompted by the same topic description; for example, babbles 257-1 and 257-3 are, respectively, fairly easy and fairly hard. [sent-181, score-1.233]

52 The rightmost column of Table 3 shows the measured WER for each scorable babble. [sent-183, score-0.174]

53 Of the 10 scorable babbles for which more than 30 per cent of the potential stopping points are scorable, three turned out to be extremely challenging for ASR, with word error rates above 0. [sent-184, score-1.071]

54 …the 10 babbles on which we focus is 0. [sent-187, score-0.521]

55 In addition to the 15 babbles shown in Table 3, there are another 15 babbles for which no relevant document was retrievable. [sent-189, score-1.167]

56 Of those, only a single babble—babble 255-2, at 54 per cent scorable and a WER of 0.402—had more than 30 per cent of the potential stopping points scorable. [sent-190, score-0.256]

58 Our stopping prediction model uses four types of features for each potential stopping point: the number of words spoken so far, the average word length so far, some “surface characteristics” of those words, and some query performance prediction metrics. [sent-194, score-0.798]
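
One plausible rendering of that feature vector in Python; the paper's exact “surface characteristics” are not spelled out in this summary, so the long-word fraction below is an assumed stand-in:

    def stopping_features(words, clarity, wig, nqc):
        # Features for one potential stopping point: query length so far,
        # average word length, a surface statistic, and the three
        # post-retrieval difficulty predictors discussed below.
        return [
            len(words),                                   # words spoken so far
            sum(map(len, words)) / len(words),            # average word length
            sum(len(w) > 6 for w in words) / len(words),  # assumed surface feature
            clarity,
            wig,
            nqc,
        ]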

59 As post-retrieval query difficulty prediction measures, we choose three that have been prominent in information retrieval research: clarity (Cronen-Townsend et al. [sent-199, score-0.21]

60 , 2002), weighted information gain (Zhou and Croft, 2007), and normalized query commitment (Shtok et al. [sent-200, score-0.163]

61 Although each takes a distinct approach, the methods all compare some aspect of the documents retrieved by a query. [sent-202, score-0.215]

62 Figure 2: Predictions for babble 274-1 made by a decision tree classifier trained on 27 babbles for the nine other topics. [sent-204, score-1.228]

63 For each point, the mean reciprocal rank is annotated to indicate the correctness of the guess made by the classifier. [sent-205, score-0.402]

64 They seek to provide some measure of information about how likely a query is to have ranked the documents well when relevance judgments are not available. [sent-217, score-0.285]

65 Weighted information gain and normalized query commitment look at the scores of the retrieved documents, the former comparing the mean score of the retrieved set with that of the entire corpus; the latter measuring the standard deviation of the scores for the retrieved set. [sent-219, score-0.292]
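
Hedged Python sketches of the three predictors, following common formulations from the cited papers (normalization details vary across implementations, and estimating the query language model for clarity is omitted):

    import math
    import statistics

    def clarity(query_lm, collection_lm):
        # Clarity (Cronen-Townsend et al., 2002): KL divergence between a
        # query language model, estimated from top-ranked documents, and
        # the collection language model. Both arguments map word -> probability.
        return sum(p * math.log2(p / collection_lm[w])
                   for w, p in query_lm.items() if p > 0)

    def wig(top_scores, corpus_score, query_len):
        # Weighted information gain (Zhou and Croft, 2007): mean top-ranked
        # retrieval score relative to a corpus-wide score, normalized by
        # query length.
        k = len(top_scores)
        return sum(s - corpus_score for s in top_scores) / (k * math.sqrt(query_len))

    def nqc(top_scores, corpus_score):
        # Normalized query commitment (Shtok et al.): standard deviation of
        # the top-ranked scores, normalized by the corpus score.
        return statistics.pstdev(top_scores) / abs(corpus_score)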

66 A separate classifier was then trained for each topic by creating a binary objective function for all 27 babbles for the nine other topics, then using every query for every one of those babbles as training instances. [sent-221, score-1.288]

67 The objective function produces 1 if the query actually retrieved a relevant document at first rank, and 0 otherwise. [sent-222, score-0.304]
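
A sketch of that hold-one-topic-out training loop, using scikit-learn's decision tree as an assumed (unnamed in this summary) implementation choice:

    from sklearn.tree import DecisionTreeClassifier

    def train_for_topic(features_by_topic, labels_by_topic, held_out_topic):
        # Fit on every query from the 27 babbles of the nine other topics.
        # A label is 1 when that query placed a known relevant document at
        # rank one, and 0 otherwise.
        X, y = [], []
        for topic, feature_rows in features_by_topic.items():
            if topic == held_out_topic:
                continue
            X.extend(feature_rows)
            y.extend(labels_by_topic[topic])
        return DecisionTreeClassifier().fit(X, y)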

68 As can be seen, the decision tree classifier seems to be a good choice, so in Section 7 we compare the stopping prediction model based on a decision tree classifier trained using hold-one-topic-out cross-validation with three baseline models. [sent-224, score-0.611]

69 To evaluate a stopping prediction model, the fundamental goal is to stop with a relevant document in rank one, and to do so as close in time as possible to the first such opportunity. [sent-227, score-0.451]

70 If the first guess is bad, it would be reasonable to score a second guess, with some penalty. [sent-228, score-0.222]

71 Let q0 be the first point within a query where the reciprocal rank is one. [sent-240, score-0.329]

72 Let pi be the first “yes” guess of the predictor after point q0. [sent-241, score-0.262]

73 From Figure 1, in some cases the potential stopping points are consecutive, while in others they are intermittent—we penalize delays from the first good opportunity even when there is no relevant document in position one because we feel that best models the user experience. [sent-246, score-0.592]
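
This summary does not give the measure in closed form, so the following is only a sketch of a delay-penalized score built from q0 and the first “yes” guess after it, with an assumed linear penalty schedule:

    def stopping_score(q0, yes_guesses, penalty_per_word=0.05):
        # q0: first word position where the reciprocal rank is one.
        # yes_guesses: word positions at which the predictor said "yes".
        # Full credit for stopping at q0; credit decays with the delay of
        # the first "yes" after q0 (the paper's exact schedule may differ).
        later = [p for p in sorted(yes_guesses) if p >= q0]
        if not later:
            return 0.0                # never barged in: no credit
        delay = later[0] - q0         # words elapsed past the first opportunity
        return max(0.0, 1.0 - penalty_per_word * delay)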

74 The deterministic baseline made its first guess at a calculated point in the babble, and continued to guess at each word thereafter. [sent-250, score-0.543]

75 The initial guess was determined by taking the average of the first scorable point of the other 27 out-of-topic babbles. [sent-251, score-0.436]
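
A sketch of this deterministic baseline (names illustrative):

    def deterministic_baseline(out_of_topic_first_points, babble_len):
        # First guess at the average first scorable point of the 27
        # out-of-topic babbles, then a guess at every word thereafter.
        start = round(sum(out_of_topic_first_points) / len(out_of_topic_first_points))
        return list(range(start, babble_len + 1))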

76 The random baseline drew the first and second words at which to guess “yes” as samples from a … Figure 3: First guesses for various classifiers plotted against the first instance of rank one documents within a babble. [sent-252, score-0.45]

77 Points below the diagonal are places where the classifier guessed too early; points above are guesses too late. [sent-253, score-0.2]

78 All 11 babbles for which the decision tree classifier made a guess are shown. [sent-254, score-0.429]

79 Figure 3 shows the extent to which each classifier's first guess is early, on time, or late. [sent-257, score-0.222]

80 …0.5; for late guesses the penalty depends on how late the guess is. [sent-261, score-0.364]

81 As can be seen, our decision tree classifier (“trees”) guesses early more often than it guesses late. [sent-262, score-0.383]

82 For an additional four cases (not plotted), the decision tree classifier never makes a guess. [sent-263, score-0.18]

83 These results are averaged over all eleven babbles for which the decision tree classifier made at least one guess; no guess was made on babbles 257-3, 266-2, 260-3, or 274-3. [sent-265, score-1.531]

84 Figure 4: Evaluation using all available babbles in which the tree classifier made a guess. [sent-266, score-0.658]

85 The leftmost point in each figure, plotted at a “window size” of one, shows the results for the stopping prediction models as we have described them. [sent-268, score-0.32]

86 It is possible, and indeed not unusual, for our decision tree classifier to make two or three guesses in a row, however, in part because it has no feature telling it how long it has been since its most recent guess. [sent-269, score-0.268]

87 To see whether adding a bit of patience would help, we added a deterministic period following each guess in which no additional guess would be allowed. [sent-270, score-0.476]

88 We call the point at which this delay expires, and a guess is again allowed, the delay “window.” [sent-271, score-0.328]

89 As can be seen, a window size of ten or eleven—allowing the next guess no sooner than the tenth or eleventh subsequent word—is optimal for the decision tree classifier when averaged over these eleven babbles. [sent-272, score-0.435]
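
The delay window can be applied as a simple post-processing pass over the classifier's raw guesses, sketched here:

    def apply_delay_window(raw_guesses, window):
        # After each emitted guess, suppress further guesses until `window`
        # words have elapsed; window = 1 reproduces the unmodified model.
        kept, next_allowed = [], 0
        for position in sorted(raw_guesses):
            if position >= next_allowed:
                kept.append(position)
                next_allowed = position + window
        return kept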

90 Although it has fewer features available to it—knowing only the mean number of words to the first opportunity for other topics—it is able to outperform the decision tree classifier for relatively large window sizes. [sent-275, score-0.224]

91 From this analysis we conclude that our decision tree classifier shows promise; and that going forward, it would likely be beneficial to integrate features of the deterministic classifier. [sent-276, score-0.212]

92 Moreover, we need some approach to accommodate the four cases in which the decision tree classifier never guesses. [sent-278, score-0.18]

93 Setting a maximum point at which the first guess will be tried could be a useful initial heuristic, and one that would be reasonable to apply in practice. [sent-279, score-0.262]

94 We have used a simulation study to show that building a system for query by babbling is feasible. [sent-280, score-0.27]

95 Moreover, we have suggested a reasonable evaluation measure for this task, and we have shown that several simple baselines for predicting stopping points can be beaten by a decision tree classifier. [sent-281, score-0.378]

96 Our next step is to try these same techniques with spoken questions and spoken answers in a low-resource language using the test collection that is being developed for the MediaEval 2013 Question Answering for the Spoken Web task. [sent-282, score-0.324]

97 There has been some work on techniques for recognizing useful query terms in long queries, but of course we will need to do that with spoken queries, and moreover with queries spoken in a language for which we have at best limited speech processing capabilities available. [sent-285, score-0.6]

98 Acknowledgments We thank Anna Shtok for her assistance with the understanding and implementation of the various query prediction metrics. [sent-288, score-0.167]

99 Soft indexing of speech content for search in spoken documents. [sent-306, score-0.273]

100 Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel. [sent-335, score-0.236]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('babbles', 0.521), ('babble', 0.458), ('guess', 0.222), ('stopping', 0.22), ('scorable', 0.174), ('spoken', 0.162), ('trec', 0.136), ('query', 0.136), ('electric', 0.124), ('queries', 0.098), ('wer', 0.088), ('guesses', 0.088), ('readability', 0.084), ('asr', 0.084), ('cent', 0.082), ('vehicles', 0.082), ('babblers', 0.079), ('babbling', 0.079), ('mediaeval', 0.079), ('reciprocal', 0.078), ('rank', 0.075), ('relevance', 0.073), ('relevant', 0.071), ('decision', 0.07), ('prompted', 0.069), ('classifier', 0.067), ('indri', 0.063), ('shtok', 0.063), ('simulation', 0.055), ('topics', 0.055), ('user', 0.055), ('document', 0.054), ('coleman', 0.047), ('consumption', 0.047), ('interrupt', 0.047), ('kincaid', 0.047), ('mamou', 0.047), ('medhi', 0.047), ('mudliar', 0.047), ('nitendra', 0.047), ('senter', 0.047), ('sherwani', 0.047), ('points', 0.045), ('credit', 0.045), ('opportunity', 0.044), ('retrieval', 0.043), ('car', 0.043), ('tree', 0.043), ('retrieved', 0.043), ('topic', 0.043), ('speech', 0.042), ('ultimately', 0.042), ('position', 0.042), ('response', 0.041), ('metze', 0.041), ('strohman', 0.041), ('chelba', 0.041), ('chia', 0.041), ('flesch', 0.041), ('users', 0.041), ('judgments', 0.04), ('point', 0.04), ('content', 0.039), ('rel', 0.038), ('sigir', 0.038), ('documents', 0.036), ('oard', 0.035), ('eleven', 0.033), ('delay', 0.033), ('speaking', 0.033), ('deterministic', 0.032), ('babbler', 0.032), ('barged', 0.032), ('barging', 0.032), ('delays', 0.032), ('dtic', 0.032), ('krovetz', 0.032), ('manufacturers', 0.032), ('minute', 0.032), ('petrol', 0.032), ('rajput', 0.032), ('smog', 0.032), ('thirty', 0.032), ('responses', 0.031), ('prediction', 0.031), ('search', 0.03), ('agarwal', 0.03), ('mobile', 0.029), ('plotted', 0.029), ('potential', 0.029), ('commitment', 0.027), ('lobby', 0.027), ('carmel', 0.027), ('fog', 0.027), ('gunning', 0.027), ('known', 0.027), ('early', 0.027), ('answer', 0.027), ('made', 0.027), ('late', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999958 173 emnlp-2013-Simulating Early-Termination Search for Verbose Spoken Queries

Author: Jerome White ; Douglas W. Oard ; Nitendra Rajput ; Marion Zalk

Abstract: Building search engines that can respond to spoken queries with spoken content requires that the system not just be able to find useful responses, but also that it know when it has heard enough about what the user wants to be able to do so. This paper describes a simulation study with queries spoken by non-native speakers that suggests that finding relevant content is often possible within half a minute, and that combining features based on automatically recognized words with features designed for automated prediction of query difficulty can serve as a useful basis for predicting when that useful content has been found.

2 0.14341544 97 emnlp-2013-Identifying Web Search Query Reformulation using Concept based Matching

Author: Ahmed Hassan

Abstract: Web search users frequently modify their queries in hope of receiving better results. This process is referred to as “Query Reformulation”. Previous research has mainly focused on proposing query reformulations in the form of suggested queries for users. Some research has studied the problem of predicting whether the current query is a reformulation of the previous query or not. However, this work has been limited to bag-of-words models where the main signals being used are word overlap, character level edit distance and word level edit distance. In this work, we show that relying solely on surface level text similarity results in many false positives where queries with different intents yet similar topics are mistakenly predicted as query reformulations. We propose a new representation for Web search queries based on identifying the concepts in queries and show that we can sig- nificantly improve query reformulation performance using features of query concepts.

3 0.09148483 105 emnlp-2013-Improving Web Search Ranking by Incorporating Structured Annotation of Queries

Author: Xiao Ding ; Zhicheng Dou ; Bing Qin ; Ting Liu ; Ji-rong Wen

Abstract: Web users are increasingly looking for structured data, such as lyrics, job, or recipes, using unstructured queries on the web. However, retrieving relevant results from such data is a challenging problem due to the unstructured language of the web queries. In this paper, we propose a method to improve web search ranking by detecting Structured Annotation of queries based on top search results. In a structured annotation, the original query is split into different units that are associated with semantic attributes in the corresponding domain. We evaluate our techniques using real world queries and achieve significant improvement.

4 0.080735572 39 emnlp-2013-Boosting Cross-Language Retrieval by Learning Bilingual Phrase Associations from Relevance Rankings

Author: Artem Sokolov ; Laura Jehl ; Felix Hieber ; Stefan Riezler

Abstract: We present an approach to learning bilingual n-gram correspondences from relevance rankings of English documents for Japanese queries. We show that directly optimizing cross-lingual rankings rivals and complements machine translation-based cross-language information retrieval (CLIR). We propose an efficient boosting algorithm that deals with very large cross-product spaces of word correspondences. We show in an experimental evaluation on patent prior art search that our approach, and in particular a consensus-based combination of boosting and translation-based approaches, yields substantial improvements in CLIR performance. Our training and test data are made publicly available.

5 0.076750249 180 emnlp-2013-The Answer is at your Fingertips: Improving Passage Retrieval for Web Question Answering with Search Behavior Data

Author: Mikhail Ageev ; Dmitry Lagun ; Eugene Agichtein

Abstract: Passage retrieval is a crucial first step of automatic Question Answering (QA). While existing passage retrieval algorithms are effective at selecting document passages most similar to the question, or those that contain the expected answer types, they do not take into account which parts of the document the searchers actually found useful. We propose, to the best of our knowledge, the first successful attempt to incorporate searcher examination data into passage retrieval for question answering. Specifically, we exploit detailed examination data, such as mouse cursor movements and scrolling, to infer the parts of the document the searcher found interesting, and then incorporate this signal into passage retrieval for QA. Our extensive experiments and analysis demonstrate that our method significantly improves passage retrieval, compared to using textual features alone. As an additional contribution, we make available to the research community the code and the search behavior data used in this study, with the hope of encouraging further research in this area.

6 0.07341513 4 emnlp-2013-A Dataset for Research on Short-Text Conversations

7 0.069625936 31 emnlp-2013-Automatic Feature Engineering for Answer Selection and Extraction

8 0.062546179 24 emnlp-2013-Application of Localized Similarity for Web Documents

9 0.057453051 126 emnlp-2013-MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text

10 0.054898471 148 emnlp-2013-Orthonormal Explicit Topic Analysis for Cross-Lingual Document Matching

11 0.053564396 7 emnlp-2013-A Hierarchical Entity-Based Approach to Structuralize User Generated Content in Social Media: A Case of Yahoo! Answers

12 0.052269332 95 emnlp-2013-Identifying Multiple Userids of the Same Author

13 0.047010399 178 emnlp-2013-Success with Style: Using Writing Style to Predict the Success of Novels

14 0.045229331 89 emnlp-2013-Gender Inference of Twitter Users in Non-English Contexts

15 0.044637851 121 emnlp-2013-Learning Topics and Positions from Debatepedia

16 0.042500876 16 emnlp-2013-A Unified Model for Topics, Events and Users on Twitter

17 0.040584505 69 emnlp-2013-Efficient Collective Entity Linking with Stacking

18 0.040412799 179 emnlp-2013-Summarizing Complex Events: a Cross-Modal Solution of Storylines Extraction and Reconstruction

19 0.040082615 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation

20 0.03960187 6 emnlp-2013-A Generative Joint, Additive, Sequential Model of Topics and Speech Acts in Patient-Doctor Communication


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.148), (1, 0.043), (2, -0.058), (3, 0.018), (4, -0.021), (5, 0.007), (6, 0.074), (7, 0.133), (8, 0.072), (9, -0.098), (10, -0.094), (11, 0.143), (12, -0.107), (13, -0.073), (14, 0.11), (15, 0.029), (16, -0.03), (17, -0.087), (18, 0.065), (19, 0.028), (20, 0.079), (21, -0.058), (22, -0.031), (23, 0.057), (24, 0.0), (25, -0.002), (26, -0.035), (27, -0.032), (28, 0.033), (29, 0.052), (30, -0.005), (31, -0.074), (32, -0.082), (33, 0.067), (34, -0.028), (35, -0.041), (36, -0.045), (37, 0.036), (38, -0.004), (39, 0.002), (40, -0.079), (41, -0.006), (42, 0.026), (43, 0.028), (44, -0.016), (45, 0.102), (46, -0.003), (47, 0.063), (48, 0.043), (49, 0.108)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94523221 173 emnlp-2013-Simulating Early-Termination Search for Verbose Spoken Queries

Author: Jerome White ; Douglas W. Oard ; Nitendra Rajput ; Marion Zalk

Abstract: Building search engines that can respond to spoken queries with spoken content requires that the system not just be able to find useful responses, but also that it know when it has heard enough about what the user wants to be able to do so. This paper describes a simulation study with queries spoken by non-native speakers that suggests that indicates that finding relevant content is often possible within a half minute, and that combining features based on automatically recognized words with features designed for automated prediction of query difficulty can serve as a useful basis for predicting when that useful content has been found.

2 0.7521103 97 emnlp-2013-Identifying Web Search Query Reformulation using Concept based Matching

Author: Ahmed Hassan

Abstract: Web search users frequently modify their queries in hope of receiving better results. This process is referred to as “Query Reformulation”. Previous research has mainly focused on proposing query reformulations in the form of suggested queries for users. Some research has studied the problem of predicting whether the current query is a reformulation of the previous query or not. However, this work has been limited to bag-of-words models where the main signals being used are word overlap, character level edit distance and word level edit distance. In this work, we show that relying solely on surface level text similarity results in many false positives where queries with different intents yet similar topics are mistakenly predicted as query reformulations. We propose a new representation for Web search queries based on identifying the concepts in queries and show that we can sig- nificantly improve query reformulation performance using features of query concepts.

3 0.6621151 180 emnlp-2013-The Answer is at your Fingertips: Improving Passage Retrieval for Web Question Answering with Search Behavior Data

Author: Mikhail Ageev ; Dmitry Lagun ; Eugene Agichtein

Abstract: Passage retrieval is a crucial first step of automatic Question Answering (QA). While existing passage retrieval algorithms are effective at selecting document passages most similar to the question, or those that contain the expected answer types, they do not take into account which parts of the document the searchers actually found useful. We propose, to the best of our knowledge, the first successful attempt to incorporate searcher examination data into passage retrieval for question answering. Specifically, we exploit detailed examination data, such as mouse cursor movements and scrolling, to infer the parts of the document the searcher found interesting, and then incorporate this signal into passage retrieval for QA. Our extensive experiments and analysis demonstrate that our method significantly improves passage retrieval, compared to using textual features alone. As an additional contribution, we make available to the research community the code and the search behavior data used in this study, with the hope of encouraging further research in this area.

4 0.63240355 105 emnlp-2013-Improving Web Search Ranking by Incorporating Structured Annotation of Queries

Author: Xiao Ding ; Zhicheng Dou ; Bing Qin ; Ting Liu ; Ji-rong Wen

Abstract: Web users are increasingly looking for structured data, such as lyrics, job, or recipes, using unstructured queries on the web. However, retrieving relevant results from such data is a challenging problem due to the unstructured language of the web queries. In this paper, we propose a method to improve web search ranking by detecting Structured Annotation of queries based on top search results. In a structured annotation, the original query is split into different units that are associated with semantic attributes in the corresponding domain. We evaluate our techniques using real world queries and achieve significant improvement.

5 0.59835595 39 emnlp-2013-Boosting Cross-Language Retrieval by Learning Bilingual Phrase Associations from Relevance Rankings

Author: Artem Sokolov ; Laura Jehl ; Felix Hieber ; Stefan Riezler

Abstract: We present an approach to learning bilingual n-gram correspondences from relevance rankings of English documents for Japanese queries. We show that directly optimizing cross-lingual rankings rivals and complements machine translation-based cross-language information retrieval (CLIR). We propose an efficient boosting algorithm that deals with very large cross-product spaces of word correspondences. We show in an experimental evaluation on patent prior art search that our approach, and in particular a consensus-based combination of boosting and translation-based approaches, yields substantial improvements in CLIR performance. Our training and test data are made publicly available.

6 0.50843656 95 emnlp-2013-Identifying Multiple Userids of the Same Author

7 0.46810001 24 emnlp-2013-Application of Localized Similarity for Web Documents

8 0.43975872 200 emnlp-2013-Well-Argued Recommendation: Adaptive Models Based on Words in Recommender Systems

9 0.40359902 203 emnlp-2013-With Blinkers on: Robust Prediction of Eye Movements across Readers

10 0.3996827 129 emnlp-2013-Measuring Ideological Proportions in Political Speeches

11 0.3932341 7 emnlp-2013-A Hierarchical Entity-Based Approach to Structuralize User Generated Content in Social Media: A Case of Yahoo! Answers

12 0.38047177 155 emnlp-2013-Question Difficulty Estimation in Community Question Answering Services

13 0.37338832 121 emnlp-2013-Learning Topics and Positions from Debatepedia

14 0.36973396 153 emnlp-2013-Predicting the Resolution of Referring Expressions from User Behavior

15 0.34732321 142 emnlp-2013-Open-Domain Fine-Grained Class Extraction from Web Search Queries

16 0.34661055 148 emnlp-2013-Orthonormal Explicit Topic Analysis for Cross-Lingual Document Matching

17 0.33426929 131 emnlp-2013-Mining New Business Opportunities: Identifying Trend related Products by Leveraging Commercial Intents from Microblogs

18 0.33082837 100 emnlp-2013-Improvements to the Bayesian Topic N-Gram Models

19 0.32854733 133 emnlp-2013-Modeling Scientific Impact with Topical Influence Regression

20 0.32781526 199 emnlp-2013-Using Topic Modeling to Improve Prediction of Neuroticism and Depression in College Students


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.045), (18, 0.019), (20, 0.296), (22, 0.035), (29, 0.011), (30, 0.094), (45, 0.017), (47, 0.01), (50, 0.019), (51, 0.208), (66, 0.028), (71, 0.048), (75, 0.032), (90, 0.011), (96, 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.78234315 173 emnlp-2013-Simulating Early-Termination Search for Verbose Spoken Queries

Author: Jerome White ; Douglas W. Oard ; Nitendra Rajput ; Marion Zalk

Abstract: Building search engines that can respond to spoken queries with spoken content requires that the system not just be able to find useful responses, but also that it know when it has heard enough about what the user wants to be able to do so. This paper describes a simulation study with queries spoken by non-native speakers that suggests that indicates that finding relevant content is often possible within a half minute, and that combining features based on automatically recognized words with features designed for automated prediction of query difficulty can serve as a useful basis for predicting when that useful content has been found.

2 0.76425439 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models

Author: Jesus Gonzalez-Rubio ; Daniel Ortiz-Martinez ; Jose-Miguel Benedi ; Francisco Casacuberta

Abstract: Current automatic machine translation systems are not able to generate error-free translations and human intervention is often required to correct their output. Alternatively, an interactive framework that integrates the human knowledge into the translation process has been presented in previous works. Here, we describe a new interactive machine translation approach that is able to work with phrase-based and hierarchical translation models, and integrates error-correction all in a unified statistical framework. In our experiments, our approach outperforms previous interactive translation systems, and achieves estimated effort reductions of as much as 48% relative over a traditional post-edition system.

3 0.70673186 48 emnlp-2013-Collective Personal Profile Summarization with Social Networks

Author: Zhongqing Wang ; Shoushan LI ; Fang Kong ; Guodong Zhou

Abstract: Personal profile information on social media like LinkedIn.com and Facebook.com is at the core of many interesting applications, such as talent recommendation and contextual advertising. However, personal profiles usually lack organization confronted with the large amount of available information. Therefore, it is always a challenge for people to find desired information from them. In this paper, we address the task of personal profile summarization by leveraging both personal profile textual information and social networks. Here, using social networks is motivated by the intuition that, people with similar academic, business or social connections (e.g. co-major, co-university, and cocorporation) tend to have similar experience and summaries. To achieve the learning process, we propose a collective factor graph (CoFG) model to incorporate all these resources of knowledge to summarize personal profiles with local textual attribute functions and social connection factors. Extensive evaluation on a large-scale dataset from LinkedIn.com demonstrates the effectiveness of the proposed approach. 1

4 0.62292391 132 emnlp-2013-Mining Scientific Terms and their Definitions: A Study of the ACL Anthology

Author: Yiping Jin ; Min-Yen Kan ; Jun-Ping Ng ; Xiangnan He

Abstract: This paper presents DefMiner, a supervised sequence labeling system that identifies scientific terms and their accompanying definitions. DefMiner achieves 85% F1 on a Wikipedia benchmark corpus, significantly improving the previous state-of-the-art by 8%. We exploit DefMiner to process the ACL Anthology Reference Corpus (ARC) – a large, real-world digital library of scientific articles in computational linguistics. The resulting automatically-acquired glossary represents the terminology defined over several thousand individual research articles. We highlight several interesting observations: more definitions are introduced for conference and workshop papers over the years and that multiword terms account for slightly less than half of all terms. Obtaining a list of popular , defined terms in a corpus ofcomputational linguistics papers, we find that concepts can often be categorized into one of three categories: resources, methodologies and evaluation metrics.

5 0.61454248 53 emnlp-2013-Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization

Author: Kuzman Ganchev ; Dipanjan Das

Abstract: We present a framework for cross-lingual transfer of sequence information from a resource-rich source language to a resourceimpoverished target language that incorporates soft constraints via posterior regularization. To this end, we use automatically word aligned bitext between the source and target language pair, and learn a discriminative conditional random field model on the target side. Our posterior regularization constraints are derived from simple intuitions about the task at hand and from cross-lingual alignment information. We show improvements over strong baselines for two tasks: part-of-speech tagging and namedentity segmentation.

6 0.61440593 56 emnlp-2013-Deep Learning for Chinese Word Segmentation and POS Tagging

7 0.61396283 69 emnlp-2013-Efficient Collective Entity Linking with Stacking

8 0.61361617 167 emnlp-2013-Semi-Markov Phrase-Based Monolingual Alignment

9 0.61310321 143 emnlp-2013-Open Domain Targeted Sentiment

10 0.61157233 154 emnlp-2013-Prior Disambiguation of Word Tensors for Constructing Sentence Vectors

11 0.61089236 82 emnlp-2013-Exploring Representations from Unlabeled Data with Co-training for Chinese Word Segmentation

12 0.61075169 79 emnlp-2013-Exploiting Multiple Sources for Open-Domain Hypernym Discovery

13 0.6103062 140 emnlp-2013-Of Words, Eyes and Brains: Correlating Image-Based Distributional Semantic Models with Neural Representations of Concepts

14 0.60971224 152 emnlp-2013-Predicting the Presence of Discourse Connectives

15 0.60947174 47 emnlp-2013-Collective Opinion Target Extraction in Chinese Microblogs

16 0.60909712 114 emnlp-2013-Joint Learning and Inference for Grammatical Error Correction

17 0.60908556 13 emnlp-2013-A Study on Bootstrapping Bilingual Vector Spaces from Non-Parallel Data (and Nothing Else)

18 0.60905212 36 emnlp-2013-Automatically Determining a Proper Length for Multi-Document Summarization: A Bayesian Nonparametric Approach

19 0.60888034 106 emnlp-2013-Inducing Document Plans for Concept-to-Text Generation

20 0.60870266 39 emnlp-2013-Boosting Cross-Language Retrieval by Learning Bilingual Phrase Associations from Relevance Rankings