53 emnlp-2010-Fusing Eye Gaze with Speech Recognition Hypotheses to Resolve Exophoric References in Situated Dialogue


Source: pdf

Author: Zahar Prasov ; Joyce Y. Chai

Abstract: In situated dialogue humans often utter linguistic expressions that refer to extralinguistic entities in the environment. Correctly resolving these references is critical yet challenging for artificial agents partly due to their limited speech recognition and language understanding capabilities. Motivated by psycholinguistic studies demonstrating a tight link between language production and human eye gaze, we have developed approaches that integrate naturally occurring human eye gaze with speech recognition hypotheses to resolve exophoric references in situated dialogue in a virtual world. In addition to incorporating eye gaze with the best recognized spoken hypothesis, we developed an algorithm to also handle multiple hypotheses modeled as word confusion networks. Our empirical results demonstrate that incorporating eye gaze with recognition hypotheses consistently outperforms the results obtained from processing recognition hypotheses alone. Incorporating eye gaze with word confusion networks further improves performance.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 In addition to incorporating eye gaze with the best recognized spoken hypothesis, we developed an algorithm to also handle multiple hypotheses modeled as word confusion networks. [sent-7, score-1.429]

2 Our empirical results demonstrate that incorporating eye gaze with recognition hypotheses consistently outperforms the results obtained from processing recognition hypotheses alone. [sent-8, score-1.45]

3 Incorporating eye gaze with word confusion networks further improves performance. [sent-9, score-1.252]

4 Different from traditional telephony-based spoken dialogue systems and multimodal conversational interfaces, situated dialogue supports immersion and mobility in a visually rich environment and encourages social and collaborative language use (Byron et al. [sent-12, score-0.538]

5 In situated dialogue, human users often need to make linguistic references, known as exophoric referring expressions (e. [sent-15, score-0.636]

6 To address this problem, motivated by psycholinguistic studies demonstrating a close relationship between language production and eye gaze, our previous work has incorporated naturally occurring eye gaze in reference resolution (Prasov and Chai, 2008). [sent-20, score-1.772]

7 Our findings have shown that eye gaze can partially compensate for limited language processing and domain modeling. [sent-21, score-1.073]

8 In situated dialogue, human speech and eye gaze patterns are much more complex. [sent-23, score-1.241]

9 Therefore, this paper explores new studies on incorporating eye gaze for exophoric reference resolution in a fully situated virtual environment. Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, MIT, Massachusetts, USA, 9-11 October 2010. [sent-26, score-1.597]

10 In addition to incorporating eye gaze with the best recognized spoken hypothesis, we developed an algorithm to also handle multiple hypotheses modeled as word confusion networks. [sent-29, score-1.429]

11 Our empirical results have demonstrated the utility of eye gaze for reference resolution in situated dialogue. [sent-30, score-1.404]

12 Although eye gaze is much noisier given the mobility of the user, our results have shown that incorporating eye gaze with recognition hypotheses consistently outperforms the results obtained from processing recognition hypotheses alone. [sent-31, score-2.523]

13 In addition, incorporating eye gaze with word confusion networks further improves performance. [sent-32, score-1.287]

14 Our analysis also indicates that, although a word confusion network appears to be more complicated, the time complexity of its integration with eye gaze is well within the acceptable range for real-time applications. [sent-33, score-1.276]

15 2 Related Work. Prior work in reference resolution within situated dialogue has focused on using visual context to assist reference resolution during interaction. [sent-34, score-0.718]

16 In contrast to this line of research, here we explore the use of human eye gaze during real-time interaction to model attention and facilitate reference resolution. [sent-38, score-1.162]

17 Eye gaze provides a richer medium for attentional information, but requires processing of a potentially noisy signal. [sent-39, score-0.636]

18 Eye gaze has been used to facilitate human machine conversation and automated language processing. [sent-40, score-0.604]

19 For example, eye gaze has been studied in embodied conversational discourse as a mechanism to gather visual information, aid in thinking, or facilitate turn taking and engagement (Nakano et al. [sent-41, score-1.197]

20 Recent work has explored incorporating eye gaze into automated language understanding such as automated speech recognition (Qu and Chai, 2007; Cooke and Russell, 2008), automated vocabulary acquisition (Liu et al. [sent-46, score-1.223]

21 Motivated by previous psycholinguistic findings that eye gaze is tightly linked with language processing (Just and Carpenter, 1976; Tanenhous et al. [sent-49, score-1.091]

22 , 1995; Meyer and Levelt, 1998; Griffin and Bock, 2000), our prior work incorporates eye gaze into reference resolution. [sent-50, score-1.162]

23 Our results demonstrate that such use of eye gaze can potentially compensate for a conversational system's limited language processing and domain modeling capability (Prasov and Chai, 2008). [sent-51, score-1.117]

24 In situated dialogue, eye gaze behavior is much more complex. [sent-53, score-1.192]

25 Here, gaze fixations may be made for the purpose of navigation or scanning the environment rather than referring to a particular object. [sent-54, score-1.021]

26 Therefore, the focus of our work here is on exploring these complex user behaviors in situated dialogue and examining how to combine eye gaze with ASR hypotheses for improved reference resolution. [sent-61, score-1.558]

27 In the study presented here, we apply word confusion networks (to represent ASR hypotheses) along with eye gaze to the problem of reference resolution. [sent-66, score-1.341]

28 During the experiments, a noise-canceling microphone was used to record user speech and the Tobii 1750 display-mounted eye tracker was used to record user eye movements. [sent-73, score-1.091]

29 Here, the user’s eye fixation is represented by the white dot and saccades (eye movements) are represented by white lines. [sent-75, score-0.501]

30 We focus on resolving exophoric referring expressions, which are enclosed in brackets here. [sent-81, score-0.451]

31 In our dataset, an exophoric referring expression is a non-pronominal noun phrase that refers to an entity in the extralinguistic environment. [sent-82, score-0.562]

32 In our study we focus on resolving exophoric referring expressions because they are tightly coupled with a user’s eye gaze behavior. [sent-88, score-1.637]

33 From this study, we constructed a parallel spoken utterance and eye gaze corpus. [sent-89, score-1.202]

34 Gaze fixations are characterized by objects in the virtual world that are fixated via a user’s eye gaze. [sent-97, score-0.697]

35 The data corpus was transcribed and annotated with 2204 exophoric referring expressions amongst 2052 utterances from 15 users. [sent-99, score-0.582]

36 In this case, the system must first identify one long sword as a referring expression and then resolve it to the correct set of entities in the virtual world. [sent-107, score-0.613]

37 However, it is not until the twenty-fifth ranked recognition hypothesis, H25, that we see a referring expression closest to the actual uttered referring expression. [sent-108, score-0.872]

38 Moreover, in utterances with multiple referring expressions, there may not be a single recognition hypothesis that contains all referring expressions, but each referring expression may be contained in some recognition hypothesis. [sent-109, score-1.333]

39 “I see one long sword” along with a timeline (in milliseconds) depicting the eye gaze fixations to potential referent objects that correspond to the utterance. [sent-118, score-1.322]

40 Using our data set, we can show that word confusion networks contain significantly more words that can compose a referring expression than the top recognition hypothesis. [sent-120, score-0.696]
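To make the comparison between a word confusion network and a top recognition hypothesis concrete, here is a minimal sketch of a WCN as a sequence of confusion sets, each holding competing words with posterior probabilities. The words, probabilities, and the helper names `one_best` and `all_words` are hypothetical illustrations, not data or code from the paper.

```python
from typing import List, Set, Tuple

# One confusion set: competing (word, posterior) alternatives for one position.
ConfusionSet = List[Tuple[str, float]]

# Hypothetical WCN for an utterance ending in "... one long sword".
wcn: List[ConfusionSet] = [
    [("one", 0.6), ("want", 0.4)],
    [("long", 0.5), ("wrong", 0.3), ("along", 0.2)],
    [("sort", 0.55), ("sword", 0.45)],
]


def one_best(network: List[ConfusionSet]) -> List[str]:
    """Pick the highest-posterior word from each confusion set."""
    return [max(cs, key=lambda wp: wp[1])[0] for cs in network]


def all_words(network: List[ConfusionSet]) -> Set[str]:
    """Collect every word available anywhere in the network."""
    return {word for cs in network for word, _ in cs}
```

In this toy network the 1-best path misses "sword", while the full network still contains it, which is the kind of extra coverage the sentence above refers to.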

41 This is not only useful for efficient syntactic parsing, which is necessary for identifying referring expressions, but also critical for integration with time-aligned gaze streams. [sent-129, score-0.938]

42 5 Reference Resolution Algorithm We have developed an algorithm that combines an n-best list of speech recognition hypotheses with dialogue, domain, and eye-gaze information to resolve exophoric referring expressions. [sent-130, score-0.668]

43 Since during the treasure hunting task people typically only speak about objects that are visible or have recently been visible on the screen, an object is considered to be a potential referent if it is present within a close proximity (in the same room) of the user while an utterance is spoken. [sent-132, score-0.574]
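The candidate-selection rule described above (an object is a potential referent if it shares a room with the user during the utterance) can be sketched as a simple filter. The object records and the field names `id` and `room` are assumptions made for illustration only.

```python
def potential_referents(world_objects, user_room):
    """Keep only the ids of objects located in the room the user occupies."""
    return [obj["id"] for obj in world_objects if obj["room"] == user_room]


# Hypothetical world state at the time an utterance is spoken.
world = [
    {"id": "sword_1", "room": "armory"},
    {"id": "sword_2", "room": "armory"},
    {"id": "sofa_1", "room": "lounge"},
]
candidates = potential_referents(world, "armory")
```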

44 Next, a set of all exophoric referring expressions (i. [sent-152, score-0.499]

45 Each referring expression has a corresponding confidence score, which can be computed in many different ways. [sent-155, score-0.466]

46 Step 3: Resolve referring expressions. Each referring expression rj is resolved to the top k potential referent objects according to the probability P(oi|rj), where k is determined by information from the linguistic expressions. [sent-158, score-1.153]

47 0] ms relative to the beginning of referring expression rj. [sent-163, score-0.451]

48 • Compat: Compatibility score, which specifies whether the object oi is compatible with the information specified by the referring expression rj. [sent-164, score-0.563]

49 Currently, the compatibility score is set to 1 if referring expression rj and object oi have the same object type (e. [sent-165, score-0.704]

50 α: A high α value indicates that the attentional salience score based on eye gaze carries more weight in deciding referents, while a low α value indicates that compatibility carries more weight. [sent-171, score-1.163]

51 If we do not want to integrate eye gaze in reference resolution, we can set α = 0. [sent-174, score-1.162]
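The role of α can be illustrated with a small scoring sketch. Equation 1 is not reproduced in this summary, so the combination used here, attentional salience raised to α times compatibility raised to (1 - α) and normalized over the candidates, is an assumption chosen only because it matches the described behavior (α = 0 ignores gaze). The function name and the toy values are hypothetical.

```python
def referent_probabilities(salience, compat, alpha):
    """Return a normalized score per candidate object.

    salience: dict object -> gaze-based attentional salience (assumed >= 0)
    compat:   dict object -> compatibility with the referring expression
    alpha:    weight on gaze; alpha = 0 ignores gaze entirely
    """
    # Assumed combination rule, not taken from the paper's Equation 1.
    raw = {
        obj: (salience.get(obj, 0.0) ** alpha) * (compat[obj] ** (1.0 - alpha))
        for obj in compat
    }
    total = sum(raw.values())
    if total == 0.0:
        return {obj: 0.0 for obj in raw}
    return {obj: score / total for obj, score in raw.items()}


# Toy example: two swords are equally compatible, but gaze favors sword_1.
salience = {"sword_1": 0.8, "sword_2": 0.2, "chair_1": 0.0}
compat = {"sword_1": 1.0, "sword_2": 1.0, "chair_1": 0.0}
without_gaze = referent_probabilities(salience, compat, alpha=0.0)
with_gaze = referent_probabilities(salience, compat, alpha=0.5)
```

With α = 0 the two swords tie, so compatibility alone cannot separate them; with α = 0.5 the fixated sword receives the higher score.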

52 Once all probabilities are calculated, each referring expression is resolved to a set of referent objects. [sent-177, score-0.614]

53 The second component is the probability that the referent object set is indeed the referent of this expression (which is determined by Equation 1). [sent-180, score-0.467]

54 For example, in Table 3, the referring expressions one long sword and wine overlap in position 15. [sent-186, score-0.514]

55 Finally, the resulting (referring expression, referent object set) pairs are sorted in ascending order according to their constituent referring expression timestamps. [sent-187, score-0.666]

56 For each utterance we compare the reference resolution performance with and without the integration of eye gaze information. [sent-191, score-1.38]

57 We perform the following two types of evaluation: • Lenient Evaluation: Due to speech recognition errors, there are many cases in which the algorithm may not return a referring expression that exactly matches the gold standard referring expression. [sent-197, score-0.566]

58 For applications in which it is critical to identify the objects referred to by the user, precisely identifying uttered referring expressions may be unnecessary. [sent-200, score-0.468]

59 Thus, we evaluate the reference resolution algorithm with a lenient comparison of (referring expression, referent object set) pairs. [sent-201, score-0.509]

60 In this case, two pairs are considered a match if at least the object types specified via the referring expressions match each other and the referent object sets are identical. [sent-202, score-0.707]

61 • Strict Evaluation: For some applications it may be important to identify exact referring expressions in addition to the objects they refer to. [sent-203, score-0.468]

62 Similarly, in systems that apply priming for language generation, identification of the exact referring expressions from human users could be important. [sent-206, score-0.43]

63 Thus, we also evaluate the reference resolution algorithm with a strict comparison of (referring expression, referent object set) pairs. [sent-207, score-0.464]

64 In this case, a referring expression from the system output needs to exactly match the corresponding expression from the gold standard. [sent-208, score-0.568]
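The difference between the lenient and strict comparisons of (referring expression, referent object set) pairs can be sketched as two predicates. The `Resolved` record and its `object_type` field are hypothetical, and requiring identical referent sets in the strict case is an assumption, since the extracted sentences only state the exact-expression requirement.

```python
from typing import FrozenSet, NamedTuple


class Resolved(NamedTuple):
    expression: str            # surface text of the referring expression
    object_type: str           # hypothetical type label derived from the expression
    referents: FrozenSet[str]  # ids of the resolved objects


def lenient_match(system: Resolved, gold: Resolved) -> bool:
    """Object types agree and the referent object sets are identical."""
    return (system.object_type == gold.object_type
            and system.referents == gold.referents)


def strict_match(system: Resolved, gold: Resolved) -> bool:
    """Expressions match exactly; identical referent sets are assumed as well."""
    return (system.expression == gold.expression
            and system.referents == gold.referents)


gold = Resolved("one long sword", "sword", frozenset({"sword_3"}))
system = Resolved("long sword", "sword", frozenset({"sword_3"}))
```

Here the system output counts as correct under the lenient comparison but not under the strict one, because the recognized expression dropped a word.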

65 2 Role of Eye Gaze We evaluate the effect of incorporating eye gaze information into the reference resolution algorithm using the top best recognition hypothesis (1-best), the word confusion network (WCN), and the manual speech transcription (Transcription). [sent-210, score-1.692]

66 When no gaze information is used, reference resolution solely depends on linguistic and semantic processing of referring expressions. [sent-212, score-1.15]

67 This table demonstrates that lenient reference resolution is improved by incorporating eye gaze information. [sent-214, score-1.402]

68 As can be seen in the table, incorporating eye gaze information significantly (p < 0. [sent-224, score-1.108]

69 Since eye gaze can be used to direct navigation in a mobile environment as in situated dialogue, there could be situations where eye gaze does not reflect the content of the corresponding speech. [sent-229, score-2.308]

70 In such situations, integrating eye gaze in reference resolution could be detrimental. [sent-230, score-1.285]

71 To further understand the role of eye gaze in reference resolution, we applied our reference resolution algorithm only to utterances where speech and eye gaze are considered closely coupled (i. [sent-231, score-2.606]

72 More specifically, following the previous work (Qu and Chai, 2010), we define a closely coupled utterance as one in which at least one noun or adjective describes an object that has been fixated by the corresponding gaze stream. [sent-234, score-0.868]
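The closely coupled criterion can be sketched as a check over the utterance's nouns and adjectives and the objects fixated during it. The `describes` lookup and the toy lexicon are hypothetical stand-ins for whatever domain model decides that a word describes an object.

```python
def is_closely_coupled(content_words, fixated_objects, describes):
    """True if at least one noun or adjective describes a fixated object.

    content_words:   nouns and adjectives taken from the utterance
    fixated_objects: ids of objects fixated in the corresponding gaze stream
    describes:       callable(word, object_id) -> bool (hypothetical lookup)
    """
    return any(
        describes(word, obj)
        for word in content_words
        for obj in fixated_objects
    )


# Hypothetical lexicon mapping words to the objects they can describe.
LEXICON = {
    "sword": {"sword_1", "sword_2"},
    "long": {"sword_1"},
    "sofa": {"sofa_1"},
}


def describes(word, obj):
    return obj in LEXICON.get(word, set())
```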

73 In the lenient evaluation, reference resolution performance is significantly improved for all input configurations when eye gaze information is incorporated (p < 0. [sent-236, score-1.367]

74 This observation indicates that in situated dialogue, some mechanism to predict whether a gaze stream is closely coupled with the corresponding speech content can be beneficial in further improving reference resolution performance. [sent-244, score-1.033]

75 3 Role of Word Confusion Network The effect of incorporating eye gaze with WCNs rather than 1-best recognition hypotheses into reference resolution can also be seen in Tables 4 and 5. [sent-248, score-1.491]

76 In this figure the resolution performance (in terms of lenient evaluation) for WCNs of varying depth is shown as dashed lines for with and without eye gaze configurations. [sent-266, score-1.323]

77 2 we have shown that incorporating eye gaze information improves reference resolution performance. [sent-271, score-1.32]

78 Eye-gaze information is particularly helpful for resolving referring expressions that are ambiguous from the perspective of the artificial agent. (Figure 3: Lenient F-measure at each WCN depth.) [sent-272, score-0.442]

79 Consider a scenario where the user utters a referring expression that has an equivalent semantic compatibility with multiple potential referent objects. [sent-273, score-0.707]

80 Without eye gaze information, the semantic compatibility alone could be insufficient to resolve this referring expression. [sent-276, score-1.486]

81 Thus, when eye gaze information is incorporated, the main source of performance improvement comes from better identification of potential referent objects. [sent-277, score-1.226]

82 3 we have shown that incorporating multiple speech recognition hypotheses in the form of a word confusion network further improves reference resolution performance. [sent-279, score-0.67]

83 This is especially true when exact referring expression identification is required (F-measure of 0. [sent-280, score-0.451]

84 Consider a scenario where the top recognition hypothesis of an utterance contains no referring expressions or an incorrect referring expression that has no semantically compatible potential referent objects. [sent-284, score-1.103]

85 If a referring expression with a high compatibility value to some potential referent object is present in a lower probability hypothesis, this referring expression can only be identified when a WCN rather than a 1-best hypothesis is utilized. [sent-285, score-1.191]

86 Thus, when word confusion networks are incorporated, the main source of performance improvement comes from better referring expression identification. [sent-286, score-0.585]

87 Also, both the number of words in an input utterance ASR hypothesis |w| and the number of referring expressions in a word confusion network |r| are dependent on utterance length. [sent-292, score-0.528]

88 700 when eye gaze is utilized for resolving closely coupled utterances). [sent-313, score-1.152]

89 Specifically, we discuss two types of error: (1) a referring expression is incorrectly recognized or (2) a recognized referring expression is not resolved to a correct referent object set. [sent-315, score-1.241]

90 Given transcribed data, which simulates perfectly recognized utterances, all referring expression recognition errors arise due to incorrect language processing. [sent-316, score-0.587]

91 Object set identification errors are more prevalent than referring expression recognition errors. [sent-322, score-0.517]

92 The majority of these errors occur because a referring expression is ambiguous from the perspective of the conversational system and there is not enough information to choose amongst multiple potential referent objects due to limited speech recognition and domain modeling. [sent-323, score-0.819]

93 Some of these errors can be avoided when eye gaze information is available to the system. [sent-330, score-1.073]

94 However, due to the noisy nature of eye gaze data, many such referring expressions remain ambiguous even when eye gaze information is considered. [sent-331, score-2.558]

95 8 Conclusion In this work, we have examined the utility of eye gaze and word confusion networks for reference resolution in situated dialogue within a virtual world. [sent-332, score-1.774]

96 Our empirical results indicate that incorporating eye gaze information with recognition hypotheses is beneficial for the reference resolution task compared to only using recognition hypotheses. [sent-333, score-1.557]

97 Furthermore, using a word confusion network rather than the top best recognition hypothesis further improves reference resolution performance. [sent-334, score-0.502]

98 Our findings also demonstrate that the processing speed necessary to integrate word confusion networks with eye gaze information is well within the acceptable range for real-time applications. [sent-335, score-1.252]

99 Between linguistic attention and gaze fixations in multimodal conversational interfaces. [sent-380, score-0.746]

100 An exploration of eye gaze in spoken language processing for multimodal conversational interfaces. [sent-478, score-1.209]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('gaze', 0.604), ('eye', 0.469), ('referring', 0.334), ('wcn', 0.167), ('referent', 0.135), ('confusion', 0.134), ('resolution', 0.123), ('dialogue', 0.12), ('situated', 0.119), ('expression', 0.117), ('hypotheses', 0.105), ('chai', 0.095), ('utterance', 0.095), ('reference', 0.089), ('exophoric', 0.087), ('lenient', 0.082), ('object', 0.08), ('expressions', 0.078), ('virtual', 0.071), ('network', 0.069), ('recognition', 0.066), ('prasov', 0.064), ('sword', 0.064), ('wcns', 0.064), ('utterances', 0.061), ('multimodal', 0.058), ('objects', 0.056), ('visual', 0.055), ('user', 0.052), ('speech', 0.049), ('treasure', 0.048), ('recognized', 0.048), ('depth', 0.045), ('networks', 0.045), ('conversational', 0.044), ('environment', 0.043), ('competing', 0.041), ('fixated', 0.04), ('fixations', 0.04), ('hunting', 0.04), ('strict', 0.037), ('incorporating', 0.035), ('compatibility', 0.035), ('asr', 0.035), ('coupled', 0.035), ('spoken', 0.034), ('transcription', 0.033), ('attentional', 0.032), ('configurationwithout', 0.032), ('cooke', 0.032), ('fixation', 0.032), ('gazewith', 0.032), ('oi', 0.032), ('resolving', 0.03), ('resolved', 0.028), ('interfaces', 0.028), ('mangu', 0.027), ('resolve', 0.027), ('rj', 0.026), ('visible', 0.025), ('embodied', 0.025), ('byron', 0.024), ('extralinguistic', 0.024), ('spatially', 0.024), ('qu', 0.024), ('agents', 0.023), ('salience', 0.023), ('transcribed', 0.022), ('hypothesis', 0.021), ('spatial', 0.021), ('world', 0.021), ('wine', 0.02), ('psycholinguistic', 0.018), ('position', 0.018), ('users', 0.018), ('potential', 0.018), ('insufficient', 0.017), ('room', 0.016), ('book', 0.016), ('axe', 0.016), ('bickmore', 0.016), ('cassell', 0.016), ('cohort', 0.016), ('griffin', 0.016), ('icmi', 0.016), ('kelleher', 0.016), ('kwer', 0.016), ('meyer', 0.016), ('morency', 0.016), ('nakano', 0.016), ('navigate', 0.016), ('qvarfordt', 0.016), ('sidner', 0.016), ('sofa', 0.016), ('tanenhous', 0.016), ('utters', 0.016), ('chair', 0.016), ('chart', 0.015), ('confidence', 0.015), ('closely', 0.014)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999946 53 emnlp-2010-Fusing Eye Gaze with Speech Recognition Hypotheses to Resolve Exophoric References in Situated Dialogue

Author: Zahar Prasov ; Joyce Y. Chai

Abstract: In situated dialogue humans often utter linguistic expressions that refer to extralinguistic entities in the environment. Correctly resolving these references is critical yet challenging for artificial agents partly due to their limited speech recognition and language understanding capabilities. Motivated by psycholinguistic studies demonstrating a tight link between language production and human eye gaze, we have developed approaches that integrate naturally occurring human eye gaze with speech recognition hypotheses to resolve exophoric references in situated dialogue in a virtual world. In addition to incorporating eye gaze with the best recognized spoken hypothesis, we developed an algorithm to also handle multiple hypotheses modeled as word confusion networks. Our empirical results demonstrate that incorporating eye gaze with recognition hypotheses consistently outperforms the results obtained from processing recognition hypotheses alone. Incorporating eye gaze with word confusion networks further improves performance.

2 0.12833893 26 emnlp-2010-Classifying Dialogue Acts in One-on-One Live Chats

Author: Su Nam Kim ; Lawrence Cavedon ; Timothy Baldwin

Abstract: We explore the task of automatically classifying dialogue acts in 1-on-1 online chat forums, an increasingly popular means of providing customer service. In particular, we investigate the effectiveness of various features and machine learners for this task. While a simple bag-of-words approach provides a solid baseline, we find that adding information from dialogue structure and inter-utterance dependency provides some increase in performance; learners that account for sequential dependencies (CRFs) show the best performance. We report our results from testing using a corpus of chat dialogues derived from online shopping customer-feedback data.

3 0.067172319 4 emnlp-2010-A Game-Theoretic Approach to Generating Spatial Descriptions

Author: Dave Golland ; Percy Liang ; Dan Klein

Abstract: Language is sensitive to both semantic and pragmatic effects. To capture both effects, we model language use as a cooperative game between two players: a speaker, who generates an utterance, and a listener, who responds with an action. Specifically, we consider the task of generating spatial references to objects, wherein the listener must accurately identify an object described by the speaker. We show that a speaker model that acts optimally with respect to an explicit, embedded listener model substantially outperforms one that is trained to directly generate spatial descriptions.

4 0.05921302 107 emnlp-2010-Towards Conversation Entailment: An Empirical Investigation

Author: Chen Zhang ; Joyce Chai

Abstract: While a significant amount of research has been devoted to textual entailment, automated entailment from conversational scripts has received less attention. To address this limitation, this paper investigates the problem of conversation entailment: automated inference of hypotheses from conversation scripts. We examine two levels of semantic representations: a basic representation based on syntactic parsing from conversation utterances and an augmented representation taking into consideration of conversation structures. For each of these levels, we further explore two ways of capturing long distance relations between language constituents: implicit modeling based on the length of distance and explicit modeling based on actual patterns of relations. Our empirical findings have shown that the augmented representation with conversation structures is important, which achieves the best performance when combined with explicit modeling of long distance relations.

5 0.055018399 25 emnlp-2010-Better Punctuation Prediction with Dynamic Conditional Random Fields

Author: Wei Lu ; Hwee Tou Ng

Abstract: This paper focuses on the task of inserting punctuation symbols into transcribed conversational speech texts, without relying on prosodic cues. We investigate limitations associated with previous methods, and propose a novel approach based on dynamic conditional random fields. Different from previous work, our proposed approach is designed to jointly perform both sentence boundary and sentence type prediction, and punctuation prediction on speech utterances. We performed evaluations on a transcribed conversational speech domain consisting of both English and Chinese texts. Empirical results show that our method outperforms an approach based on linear-chain conditional random fields and other previous approaches.

6 0.051237006 122 emnlp-2010-WikiWars: A New Corpus for Research on Temporal Expressions

7 0.039847456 18 emnlp-2010-Assessing Phrase-Based Translation Models with Oracle Decoding

8 0.038940728 84 emnlp-2010-NLP on Spoken Documents Without ASR

9 0.036942378 54 emnlp-2010-Generating Confusion Sets for Context-Sensitive Error Correction

10 0.033254527 19 emnlp-2010-Automatic Analysis of Rhythmic Poetry with Applications to Generation and Translation

11 0.032481331 14 emnlp-2010-A Tree Kernel-Based Unified Framework for Chinese Zero Anaphora Resolution

12 0.028688202 8 emnlp-2010-A Multi-Pass Sieve for Coreference Resolution

13 0.028272351 95 emnlp-2010-SRL-Based Verb Selection for ESL

14 0.025644217 51 emnlp-2010-Function-Based Question Classification for General QA

15 0.025303468 63 emnlp-2010-Improving Translation via Targeted Paraphrasing

16 0.024779867 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

17 0.024328627 16 emnlp-2010-An Approach of Generating Personalized Views from Normalized Electronic Dictionaries : A Practical Experiment on Arabic Language

18 0.022944707 29 emnlp-2010-Combining Unsupervised and Supervised Alignments for MT: An Empirical Study

19 0.021670412 86 emnlp-2010-Non-Isomorphic Forest Pair Translation

20 0.020896893 75 emnlp-2010-Lessons Learned in Part-of-Speech Tagging of Conversational Speech


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.086), (1, 0.023), (2, -0.007), (3, 0.06), (4, -0.045), (5, -0.006), (6, 0.024), (7, -0.071), (8, -0.017), (9, 0.007), (10, -0.173), (11, -0.162), (12, 0.024), (13, 0.138), (14, 0.114), (15, -0.119), (16, -0.076), (17, -0.017), (18, -0.079), (19, 0.053), (20, -0.173), (21, -0.001), (22, -0.054), (23, 0.071), (24, -0.072), (25, 0.011), (26, 0.236), (27, -0.094), (28, -0.1), (29, 0.222), (30, -0.163), (31, -0.044), (32, 0.097), (33, 0.061), (34, 0.02), (35, 0.064), (36, -0.007), (37, -0.251), (38, -0.138), (39, 0.007), (40, 0.02), (41, -0.096), (42, -0.093), (43, -0.112), (44, 0.094), (45, -0.092), (46, 0.071), (47, -0.047), (48, -0.057), (49, 0.139)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97522646 53 emnlp-2010-Fusing Eye Gaze with Speech Recognition Hypotheses to Resolve Exophoric References in Situated Dialogue

Author: Zahar Prasov ; Joyce Y. Chai

Abstract: In situated dialogue humans often utter linguistic expressions that refer to extralinguistic entities in the environment. Correctly resolving these references is critical yet challenging for artificial agents partly due to their limited speech recognition and language understanding capabilities. Motivated by psycholinguistic studies demonstrating a tight link between language production and human eye gaze, we have developed approaches that integrate naturally occurring human eye gaze with speech recognition hypotheses to resolve exophoric references in situated dialogue in a virtual world. In addition to incorporating eye gaze with the best recognized spoken hypothesis, we developed an algorithm to also handle multiple hypotheses modeled as word confusion networks. Our empirical results demonstrate that incorporating eye gaze with recognition hypotheses consistently outperforms the results obtained from processing recognition hypotheses alone. Incorporating eye gaze with word confusion networks further improves performance.

2 0.77618271 26 emnlp-2010-Classifying Dialogue Acts in One-on-One Live Chats

Author: Su Nam Kim ; Lawrence Cavedon ; Timothy Baldwin

Abstract: We explore the task of automatically classifying dialogue acts in 1-on-1 online chat forums, an increasingly popular means of providing customer service. In particular, we investigate the effectiveness of various features and machine learners for this task. While a simple bag-of-words approach provides a solid baseline, we find that adding information from dialogue structure and inter-utterance dependency provides some increase in performance; learners that account for sequential dependencies (CRFs) show the best performance. We report our results from testing using a corpus of chat dialogues derived from online shopping customer-feedback data.

3 0.54363209 4 emnlp-2010-A Game-Theoretic Approach to Generating Spatial Descriptions

Author: Dave Golland ; Percy Liang ; Dan Klein

Abstract: Language is sensitive to both semantic and pragmatic effects. To capture both effects, we model language use as a cooperative game between two players: a speaker, who generates an utterance, and a listener, who responds with an action. Specifically, we consider the task of generating spatial references to objects, wherein the listener must accurately identify an object described by the speaker. We show that a speaker model that acts optimally with respect to an explicit, embedded listener model substantially outperforms one that is trained to directly generate spatial descriptions.

4 0.34001279 122 emnlp-2010-WikiWars: A New Corpus for Research on Temporal Expressions

Author: Pawel Mazur ; Robert Dale

Abstract: The reliable extraction of knowledge from text requires an appropriate treatment of the time at which reported events take place. Unfortunately, there are very few annotated data sets that support the development of techniques for event time-stamping and tracking the progression of time through a narrative. In this paper, we present a new corpus of temporally-rich documents sourced from English Wikipedia, which we have annotated with TIMEX2 tags. The corpus contains around 120000 tokens, and 2600 TIMEX2 expressions, thus comparing favourably in size to other existing corpora used in these areas. We describe the preparation of the corpus, and compare the profile of the data with other existing temporally annotated corpora. We also report the results obtained when we use DANTE, our temporal expression tagger, to process this corpus, and point to where further work is required. The corpus is publicly available for research purposes.

5 0.30069441 14 emnlp-2010-A Tree Kernel-Based Unified Framework for Chinese Zero Anaphora Resolution

Author: Fang Kong ; Guodong Zhou

Abstract: This paper proposes a unified framework for zero anaphora resolution, which can be divided into three sub-tasks: zero anaphor detection, anaphoricity determination and antecedent identification. In particular, all the three sub-tasks are addressed using tree kernel-based methods with appropriate syntactic parse tree structures. Experimental results on a Chinese zero anaphora corpus show that the proposed tree kernel-based methods significantly outperform the feature-based ones. This indicates the critical role of the structural information in zero anaphora resolution and the necessity of tree kernel-based methods in modeling such structural information. To our best knowledge, this is the first systematic work dealing with all the three sub-tasks in Chinese zero anaphora resolution via a unified framework. Moreover, we release a Chinese zero anaphora corpus of 100 documents, which adds a layer of annotation to the manually-parsed sentences in the Chinese Treebank (CTB) 6.0.

6 0.14536329 25 emnlp-2010-Better Punctuation Prediction with Dynamic Conditional Random Fields

7 0.13546267 8 emnlp-2010-A Multi-Pass Sieve for Coreference Resolution

8 0.13533315 23 emnlp-2010-Automatic Keyphrase Extraction via Topic Decomposition

9 0.12404389 24 emnlp-2010-Automatically Producing Plot Unit Representations for Narrative Text

10 0.12080649 84 emnlp-2010-NLP on Spoken Documents Without ASR

11 0.11509758 18 emnlp-2010-Assessing Phrase-Based Translation Models with Oracle Decoding

12 0.11127383 91 emnlp-2010-Practical Linguistic Steganography Using Contextual Synonym Substitution and Vertex Colour Coding

13 0.10521694 54 emnlp-2010-Generating Confusion Sets for Context-Sensitive Error Correction

14 0.10070839 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

15 0.099448301 19 emnlp-2010-Automatic Analysis of Rhythmic Poetry with Applications to Generation and Translation

16 0.099350706 29 emnlp-2010-Combining Unsupervised and Supervised Alignments for MT: An Empirical Study

17 0.096936673 107 emnlp-2010-Towards Conversation Entailment: An Empirical Investigation

18 0.09497799 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?

19 0.092577845 55 emnlp-2010-Handling Noisy Queries in Cross Language FAQ Retrieval

20 0.091441922 16 emnlp-2010-An Approach of Generating Personalized Views from Normalized Electronic Dictionaries : A Practical Experiment on Arabic Language


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.018), (10, 0.02), (12, 0.036), (29, 0.056), (30, 0.021), (32, 0.024), (39, 0.014), (52, 0.018), (56, 0.068), (62, 0.013), (66, 0.047), (72, 0.072), (76, 0.07), (79, 0.016), (87, 0.012), (89, 0.015), (93, 0.371)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.73075104 53 emnlp-2010-Fusing Eye Gaze with Speech Recognition Hypotheses to Resolve Exophoric References in Situated Dialogue

Author: Zahar Prasov ; Joyce Y. Chai

Abstract: In situated dialogue humans often utter linguistic expressions that refer to extralinguistic entities in the environment. Correctly resolving these references is critical yet challenging for artificial agents partly due to their limited speech recognition and language understanding capabilities. Motivated by psycholinguistic studies demonstrating a tight link between language production and human eye gaze, we have developed approaches that integrate naturally occurring human eye gaze with speech recognition hypotheses to resolve exophoric references in situated dialogue in a virtual world. In addition to incorporating eye gaze with the best recognized spoken hypothesis, we developed an algorithm to also handle multiple hypotheses modeled as word confusion networks. Our empirical results demonstrate that incorporating eye gaze with recognition hypotheses consistently outperforms the results obtained from processing recognition hypotheses alone. Incorporating eye gaze with word confusion networks further improves performance.

2 0.31682286 26 emnlp-2010-Classifying Dialogue Acts in One-on-One Live Chats

Author: Su Nam Kim ; Lawrence Cavedon ; Timothy Baldwin

Abstract: We explore the task of automatically classifying dialogue acts in 1-on-1 online chat forums, an increasingly popular means of providing customer service. In particular, we investigate the effectiveness of various features and machine learners for this task. While a simple bag-of-words approach provides a solid baseline, we find that adding information from dialogue structure and inter-utterance dependency provides some increase in performance; learners that account for sequential dependencies (CRFs) show the best performance. We report our results from testing using a corpus of chat dialogues derived from online shopping customer-feedback data.

3 0.30753532 40 emnlp-2010-Effects of Empty Categories on Machine Translation

Author: Tagyoung Chung ; Daniel Gildea

Abstract: We examine effects that empty categories have on machine translation. Empty categories are elements in parse trees that lack corresponding overt surface forms (words) such as dropped pronouns and markers for control constructions. We start by training machine translation systems with manually inserted empty elements. We find that inclusion of some empty categories in training data improves the translation result. We expand the experiment by automatically inserting these elements into a larger data set using various methods and training on the modified corpus. We show that even when automatic prediction of null elements is not highly accurate, it nevertheless improves the end translation result.

4 0.3071698 32 emnlp-2010-Context Comparison of Bursty Events in Web Search and Online Media

Author: Yunliang Jiang ; Cindy Xide Lin ; Qiaozhu Mei

Abstract: In this paper, we conducted a systematic comparative analysis of language in different contexts of bursty topics, including web search, news media, blogging, and social bookmarking. We analyze (1) the content similarity and predictability between contexts, (2) the coverage of search content by each context, and (3) the intrinsic coherence of information in each context. Our experiments show that social bookmarking is a better predictor to the bursty search queries, but news media and social blogging media have a much more compelling coverage. This comparison provides insights on how the search behaviors and social information sharing behaviors of users are correlated to the professional news media in the context of bursty events.

5 0.29650497 117 emnlp-2010-Using Unknown Word Techniques to Learn Known Words

Author: Kostadin Cholakov ; Gertjan van Noord

Abstract: Unknown words are a hindrance to the performance of hand-crafted computational grammars of natural language. However, words with incomplete and incorrect lexical entries pose an even bigger problem because they can be the cause of a parsing failure despite being listed in the lexicon of the grammar. Such lexical entries are hard to detect and even harder to correct. We employ an error miner to pinpoint words with problematic lexical entries. An automated lexical acquisition technique is then used to learn new entries for those words which allows the grammar to parse previously uncovered sentences successfully. We test our method on a large-scale grammar of Dutch and a set of sentences for which this grammar fails to produce a parse. The application of the method enables the grammar to cover 83.76% of those sentences with an accuracy of 86.15%.

6 0.29625568 121 emnlp-2010-What a Parser Can Learn from a Semantic Role Labeler and Vice Versa

7 0.29562098 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

8 0.29516256 107 emnlp-2010-Towards Conversation Entailment: An Empirical Investigation

9 0.29331708 105 emnlp-2010-Title Generation with Quasi-Synchronous Grammar

10 0.2904548 82 emnlp-2010-Multi-Document Summarization Using A* Search and Discriminative Learning

11 0.28855062 24 emnlp-2010-Automatically Producing Plot Unit Representations for Narrative Text

12 0.28774232 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice

13 0.28737527 16 emnlp-2010-An Approach of Generating Personalized Views from Normalized Electronic Dictionaries : A Practical Experiment on Arabic Language

14 0.28700224 122 emnlp-2010-WikiWars: A New Corpus for Research on Temporal Expressions

15 0.28668293 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks

16 0.28650686 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions

17 0.28632489 120 emnlp-2010-What's with the Attitude? Identifying Sentences with Attitude in Online Discussions

18 0.28565332 35 emnlp-2010-Discriminative Sample Selection for Statistical Machine Translation

19 0.28564268 17 emnlp-2010-An Efficient Algorithm for Unsupervised Word Segmentation with Branching Entropy and MDL

20 0.28550127 48 emnlp-2010-Exploiting Conversation Structure in Unsupervised Topic Segmentation for Emails