emnlp emnlp2012 emnlp2012-23 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jordan Boyd-Graber ; Brianna Satinoff ; He He ; Hal Daume III
Abstract: Cost-sensitive classification, where the features used in machine learning tasks have a cost, has been explored as a means of balancing knowledge against the expense of incrementally obtaining new features. We introduce a setting where humans engage in classification with incrementally revealed features: the collegiate trivia circuit. By providing the community with a web-based system to practice, we collected tens of thousands of implicit word-by-word ratings of how useful features are for eliciting correct answers. Observing humans’ classification process, we improve the performance of a state-of-the-art classifier. We also use the dataset to evaluate a system to compete in the incremental classification task through a reduction of reinforcement learning to classification. Our system learns when to answer a question, performing better than baselines and most human players.
Reference: text
sentIndex sentText sentNum sentScore
1 We introduce a setting where humans engage in classification with incrementally revealed features: the collegiate trivia circuit. [sent-4, score-0.427]
2 We also use the dataset to evaluate a system to compete in the incremental classification task through a reduction of reinforcement learning to classification. [sent-7, score-0.41]
3 We discuss the incremental classification framework in Section 2. [sent-23, score-0.298]
4 Our understanding of how humans conduct incremental classification is limited. [sent-24, score-0.42]
5 Instead, we adapt a real world setting where humans are already engaging (eagerly) in incremental classification—trivia games—and develop a cheap, easy method for capturing human incremental classification judgments. [sent-26, score-0.629]
6 After qualitatively examining how humans conduct incremental classification (Section 3), we show that knowledge of a human’s incremental classification process improves state-of-the-art rapacious classification (Section 4). [sent-27, score-0.931]
7 2 Incremental Classification In this section, we discuss previous approaches that explore how much effort or resources a classifier needs to come to a decision, a problem not as thoroughly examined as the question of whether the decision is right or not. [sent-30, score-0.323]
8 In contrast, incremental classification allows the learner to decide whether to acquire additional features. [sent-39, score-0.298]
9 A common paradigm for incremental classification is to view the problem as a Markov decision process (MDP) (Zubek and Dietterich, 2002). [sent-40, score-0.339]
10 The incremental classifier can either request an additional feature or render a classification decision (Chai et al. [sent-41, score-0.38]
11 In Section 5, we use an MDP to decide whether additional features need to be processed in our application of incremental classification to a trivia game. [sent-45, score-0.396]
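As a concrete illustration of this MDP view, here is a minimal sketch of the two-action loop it implies: at each step the agent either acquires the next feature, paying its cost, or stops and classifies. The `policy` and `classifier` callables and the uniform cost are hypothetical placeholders, not the paper's system.

```python
def incremental_classify(feature_stream, policy, classifier, cost_per_feature=1.0):
    """Generic incremental classification loop: acquire features until the
    policy decides to stop, then classify on whatever has been observed."""
    observed, total_cost = [], 0.0
    for feature in feature_stream:
        if policy(observed) == "classify":   # action 1: render a decision now
            break
        observed.append(feature)             # action 2: request one more feature
        total_cost += cost_per_feature
    return classifier(observed), total_cost
```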
12 1 Trivia as Incremental Classification A real-life setting where humans classify documents incrementally is quiz bowl, an academic competition between schools in English-speaking countries; hundreds of teams compete in dozens of tournaments each year (Jennings, 2006). [sent-47, score-0.54]
13 Note the distinction between quiz bowl and Jeopardy, a recent application area (Ferrucci et al. [sent-48, score-0.47]
14 While Jeopardy also uses signaling devices, these are only usable after a question is completed (interrupting Jeopardy’s questions would make for bad television). [sent-50, score-0.31]
15 Thus, Jeopardy is rapacious classification followed by a race to see, among those who know the answer, who can punch a button first. [sent-51, score-0.302]
16 Teams interrupt the question at any point by “buzzing in”; if the answer is correct, the team gets points and the next question is read. [sent-58, score-0.571]
17 3 Figure 1: Quiz bowl question on William Jennings Bryan, a late nineteenth century American politician; obscure clues are at the beginning while more accessible clues are at the end. [sent-65, score-0.512]
18 Words (excluding stop words) are shaded based on the number of times the word triggered a buzz from any player who answered the question (darker means more buzzes; buzzes contribute to the shading of the previous five words). [sent-66, score-0.927]
19 The answers to quiz bowl questions are well-known entities (e. [sent-68, score-0.764]
20 ), so the answer space is relatively limited; there are no open-ended questions of the form “why is the sky blue? [sent-71, score-0.417]
21 First, it is a real-world instance of incremental classification that happens hundreds of thousands of times most weekends. [sent-78, score-0.298]
22 Finally, quiz bowl’s inherent fun makes it easy to acquire human responses, as we describe in the next section. [sent-80, score-0.24]
23 Users who answered later in the question had higher accuracy. [sent-86, score-0.38]
24 However, there were users that were able to answer questions relatively early without sacrificing accuracy. [sent-87, score-0.467]
25 3 Getting a Buzz through Crowdsourcing We built a corpus of 37,225 quiz bowl questions with 25,498 distinct labels, written for 121 tournaments between 1999 and 2010. [sent-88, score-0.777]
26 We created a webapp4 that simulates the experience of playing quiz bowl. [sent-89, score-0.24]
27 Text is incrementally revealed (at a pace adjustable by the user) until users press the space bar to “buzz”. [sent-90, score-0.246]
28 For example, it’s okay to say that “wj bryan” is an acceptable answer for the label “william jennings bryan”, but “asdf” is not. [sent-95, score-0.362]
29 We did not see examples of nonsense answers from malicious users; in contrast, users were stricter than we expected, perhaps because protesting required effort. [sent-96, score-0.258]
30 For example, everyone saw a question on “Jonathan Swift” and then a question on “William Jennings Bryan”, but because these labels have many questions, the specific questions each user saw differed. 4 Play online or download the datasets at http://umiacs. [sent-100, score-0.671]
31 Users see a question revealed one word at a time. [sent-104, score-0.236]
32 They signal buzzes by clicking on the answer button and input an answer. [sent-105, score-0.597]
33 Participants were eager to answer questions; over 7000 questions were answered in the first day, and over 43000 questions were answered in two weeks by 461 users. [sent-107, score-0.676]
34 To represent a “buzz”, we define a function b(q, f) (“b” for buzz) as the number of times that feature f occurred in question q at most five tokens before a user correctly buzzed on that question. [sent-108, score-0.311]
35 Aggregating buzzes across questions (summing over q) shows which features are useful for eliciting a buzz (Figure 4(a)). [sent-109, score-0.863]
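A minimal sketch of how b(q, f) and its aggregate over questions could be computed; the five-token window and the summation over q follow the text, while the input representation (token lists plus indices of correct buzzes) is a hypothetical choice.

```python
from collections import Counter

def buzz_counts(tokens, correct_buzz_indices, window=5):
    """b(q, .): for one question, count how often each feature (token) occurs
    at most `window` tokens before a correct buzz."""
    b = Counter()
    for idx in correct_buzz_indices:
        for tok in tokens[max(0, idx - window):idx]:
            b[tok] += 1
    return b

def aggregate_buzzes(questions):
    """Sum b(q, f) over all questions q to rank features by how often they elicit buzzes."""
    total = Counter()
    for tokens, buzz_indices in questions:
        total.update(buzz_counts(tokens, buzz_indices))
    return total
```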
36 The buzzes reflect what users remember about the work and are more focused than the complete question text. [sent-119, score-0.585]
37 Heights” (Figure 4(c)) is much more focused than the word cloud for all of the words from the questions with that label (Figure 4(b)). [sent-120, score-0.235]
38 4 Buzzes Reveal Useful Features If we restrict ourselves to a finite set of labels, the process of answering questions is a multiclass classification problem. [sent-121, score-0.284]
39 In this section, we show that information gleaned from humans making a similar decision can help improve rapacious machine learning classification. [sent-122, score-0.376]
40 The weight combines buzz information (described in Section 3) and tf-idf (Salton, 1968). [sent-127, score-0.361]
41 Table 1: Classification error of a rapacious classifier able to draw on human incremental classification. [sent-143, score-0.509]
42 If only β is non-zero, the number of buzzes is now a linear multiplier of the tf-idf weight (buzz-linear). [sent-147, score-0.344]
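The excerpt does not spell out the exact combination, so the sketch below assumes a simple form in which the term weight is tf-idf scaled by an affine function of the aggregated buzz count; setting α = 0 and β > 0 recovers the buzz-linear behaviour described above. The names and the idf variant are illustrative only.

```python
import math

def tf_idf(tf, df, n_docs):
    # plain tf-idf with add-one smoothing on the document frequency
    return tf * math.log(n_docs / (1.0 + df))

def buzz_weighted(term, tf, df, n_docs, buzz, alpha=1.0, beta=0.0):
    """Hypothetical combined weight: (alpha + beta * b(term)) * tf-idf(term).
    With alpha = 0 the buzz count acts as a linear multiplier of tf-idf ("buzz-linear")."""
    return (alpha + beta * buzz.get(term, 0)) * tf_idf(tf, df, n_docs)
```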
43 While not directly comparable (this classifier is rapacious, not incremental, and has a predefined answer space), the average user had an error rate of 16. [sent-152, score-0.297]
44 5 Building an Incremental Classifier In the previous section we improved rapacious classification using humans’ incremental classification. [sent-154, score-0.511]
45 A more interesting problem is how to compete against humans in incremental classification. [sent-155, score-0.391]
46 Doing so requires us to formulate an incremental representation of the contents of questions and to learn a strategy to decide when to buzz. [sent-157, score-0.367]
47 Because this is the first machine learning algorithm for quiz bowl, we attempt to provide reasonable rapacious baselines and compare against our new strategies. [sent-158, score-0.24]
48 In our context, a state is a sequence of (thus far revealed) tokens, and the action is whether to buzz or not. [sent-164, score-0.473]
49 Given examples of the correct answer given a configuration of the state space, we can learn an MDP without explicitly representing the reward function. [sent-170, score-0.275]
50 1 Action Space We assume that there are only two possible actions: buzz now or wait. [sent-173, score-0.361]
51 However, this conflates the question of when to buzz with what to answer. [sent-177, score-0.513]
52 Instead, we call the distinct component that provides what to answer the content model. [sent-178, score-0.382]
53 For the moment, assume that a content model maintains a posterior distribution over labels and when needed can provide its best guess (e. [sent-181, score-0.322]
54 The classifier attempts to learn a mapping x → y in which y is “buzz” in all states x where the content model gave a correct response given state x. [sent-186, score-0.258]
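A sketch of this reduction from reinforcement learning to classification: every prefix of a training question becomes a state, labelled "buzz" exactly when the content model's current best guess already matches the gold answer. The feature set shown (tokens revealed, posterior of the top guess, its margin over the runner-up) and the `posterior` interface are plausible stand-ins, not the paper's exact configuration; any off-the-shelf binary classifier could then be trained on the resulting pairs.

```python
def buzz_training_examples(questions, content_model):
    """questions: iterable of (tokens, gold_label).
    Returns parallel lists of feature dicts and binary labels (1 = buzz)."""
    X, y = [], []
    for tokens, gold in questions:
        for t in range(1, len(tokens) + 1):
            post = content_model.posterior(tokens[:t])   # dict: label -> probability
            guess = max(post, key=post.get)
            ranked = sorted(post.values(), reverse=True)
            X.append({
                "tokens_revealed": t,
                "top_posterior": ranked[0],
                "margin": ranked[0] - (ranked[1] if len(ranked) > 1 else 0.0),
            })
            y.append(1 if guess == gold else 0)
    return X, y
```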
55 For example, if you know your opponent is unlikely to answer a question, it is better to wait until you are more confident. [sent-195, score-0.368]
56 For example, if a right answer is worth +10 points and the penalty for an incorrect answer is −5, then a team leading by 15 points on the last question should never attempt to answer. [sent-200, score-0.61]
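Ignoring the opponent and the learned policy, this payoff structure alone already implies a confidence threshold: with +10 for a correct buzz and −5 for an incorrect one, a risk-neutral player gains in expectation only when its confidence p in the current guess satisfies

```latex
\mathbb{E}[\text{score} \mid \text{buzz}] \;=\; 10p \;-\; 5(1 - p) \;>\; 0
\quad\Longleftrightarrow\quad p \;>\; \tfrac{1}{3}.
```

This is only a baseline intuition; the learned policy additionally weighs the risk that the opponent buzzes first.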
57 We also investigated learning a policy directly from users’ buzzes (Abbeel and Ng, 2004), but this performed poorly because the content model is incompatible with the players’ abilities and because of the high variation in players’ abilities and styles (compare Figure 2). [sent-203, score-0.571]
58 We use four components to form the state space: what information has been observed, what the content model believes is the correct answer, how confident the content model is, and whether the content model’s confidence is changing. [sent-208, score-0.541]
59 6 Guess An additional feature that we used to represent the state space is the current guess of the content model; i. [sent-211, score-0.303]
60 , 2008), a regression predicting how many individuals would buzz in the next n words, the year the question was written, the category of the question, etc. [sent-220, score-0.513]
61 ” We call the component of our model that answers this question the content model. [sent-225, score-0.45]
62 This generative model assumes labels for questions come from a multinomial distribution Mult(φ) with φ ∼ Dir(α). 6 The phrase “for ten points” (abbreviated FTP) appears in all quiz bowl questions to signal the question’s last sentence or clause. [sent-227, score-0.87]
63 It is a signal to answer soon, as the final “giveaway” clue is next. [sent-228, score-0.253]
64 In addition to providing our answers, the content model also provides an additional, critically important feature for our state space: its posterior (pos for short) probability. [sent-236, score-0.279]
65 With every revealed feature, the content model updates its posterior distribution over labels given that t tokens have been revealed in question n, p(zn | w1 . . . wt). [sent-237, score-0.636]
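A minimal sketch of how such an incremental posterior update could be implemented for a naïve-Bayes-style content model: log-probabilities are accumulated as each token is revealed, so p(zn | w1 . . . wt) is available after every word. The smoothing floor and data structures are assumptions, not the paper's implementation.

```python
import math

class IncrementalNaiveBayes:
    def __init__(self, label_prior, word_prob):
        # label_prior: {label: p(label)}; word_prob: {label: {word: p(word | label)}}
        self.word_prob = word_prob
        self.log_score = {l: math.log(p) for l, p in label_prior.items()}

    def observe(self, word, floor=1e-6):
        """Update the running log-scores with one newly revealed token."""
        for l in self.log_score:
            self.log_score[l] += math.log(self.word_prob[l].get(word, floor))

    def posterior(self):
        """Normalized posterior over labels given all tokens revealed so far."""
        m = max(self.log_score.values())
        unnorm = {l: math.exp(s - m) for l, s in self.log_score.items()}
        z = sum(unnorm.values())
        return {l: v / z for l, v in unnorm.items()}
```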
66 After demonstrating our ability to learn an incremental classifier using this simple content model, we extend the content model to capture local context and correlations between similar labels in Section 7. [sent-246, score-0.625]
67 We simulate competition by taking the human answers and buzzes as given and asking our algorithm (independently) to provide its decision on when to buzz on a test set. [sent-257, score-0.882]
68 The indices were chosen as the quartiles for question length (by convention, most questions are of similar length). [sent-265, score-0.31]
69 We compare these baselines against policies that decide when to buzz based on the state. [sent-266, score-0.396]
70 To best simulate conventional quiz bowl settings, a correct answer was worth +10 and an incorrect answer −5. [sent-268, score-0.949]
71 Cases where the opponent buzzes first but is wrong are equivalent to rapacious classification, as there is no longer any incentive to answer early. [sent-270, score-0.932]
72 To focus on incremental classification, we exclude instances where the human interrupts with an incorrect answer, as after an opponent eliminates themselves, the answering reduces to rapacious classification. [sent-273, score-0.613]
73 While incremental algorithms outperform rapacious baselines, they lose to humans. [sent-275, score-0.422]
74 Although the content model is simple, this poor performance does not stem from the content model never producing the correct answer. [sent-277, score-0.324]
75 Thus, while the content model was able to come up with correct answers often enough to win on average against opponents (even the best human players), we were unable to consistently learn winning policies. [sent-280, score-0.364]
76 There are two ways to solve this problem: create deeper, more nuanced policies (or the features that feed into them) or refine content models that provide the signal needed for our policies to make sound decisions. [sent-281, score-0.265]
77 7 Expanding the Content Model When we asked quiz bowlers how they answer questions, they said that they first determine the category of the question. Table 3: Performance of strategies against users. [sent-283, score-0.496]
78 The human scoring columns show the average points (positive means winning on average, negative means losing on average) that the algorithm would expect to accumulate per question versus each human amalgam metric. [sent-284, score-0.304]
79 Ideally, the content model should conduct the same calculus: if a question seems to be about mathematics, all answers related to mathematics should be more likely in the posterior. [sent-287, score-0.45]
80 , answering “entropy” for “Johannes Brahms”, when an answer such as “Robert Schumann”, another composer, would be better). [sent-290, score-0.257]
81 words, draw answer l ∼ Mult(φ): (a) assume w0 ≡ START; (b) draw wn ∼ Mult(θl,c,wn−1) for n ∈ {1, . . . , N}. [sent-304, score-0.266]
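Piecing the garbled specification above together, one plausible reading of the extended content model (with answer label l and category c, the draw of c not being visible in this excerpt) is:

```latex
\begin{aligned}
\phi &\sim \mathrm{Dir}(\alpha), \qquad l \sim \mathrm{Mult}(\phi) \\
w_0 &\equiv \textsc{start} \\
w_n &\sim \mathrm{Mult}\bigl(\theta_{l,\,c,\,w_{n-1}}\bigr) \qquad \text{for } n \in \{1, \dots, N\},
\end{aligned}
```

so each word is emitted from a bigram-style multinomial conditioned on the answer, the category, and the previous word.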
82 We compare the naïve model with models that capture more of the content in the text in Table 4; these results also include intermediate models between naïve Bayes and the full content model: “cat” (omit 2. [sent-314, score-0.324]
83 They are about even against the mean and median players and lose four points per question against top players. [sent-320, score-0.307]
84 The question leads off with Ravel’s orchestral version of users, but with enhanced content models. [sent-326, score-0.314]
85 The first clues are about magnetic fields near a Fermi surface, which causes the content model to view “magnetic field” as the most likely answer. [sent-333, score-0.276]
86 , 2010), has a coreference system as its content model (Haghighi and Klein, 2007), or determines the correct question type (Moldovan et al. [sent-336, score-0.314]
87 The question is a very difficult one about George Washington, America’s first president. [sent-339, score-0.304]
88 To answer these types of question, [sent-343, score-0.324]
89 Figure 5: Three questions where our algorithm performed poorly. [sent-346, score-0.378]
90 Lines represent the current estimate posterior probability of the answer (red) and the proportion of opponents who have answered the question correctly (cyan). [sent-352, score-0.57]
91 the repository used to train the content model would have to be orders of magnitude larger to be able to link the disparate clues in the question to a consistent target. [sent-356, score-0.379]
92 2 Assumptions We have made assumptions to solve a problem that is subtly different from the game of quiz bowl that a human would play. [sent-359, score-0.47]
93 On the other hand, to focus on incremental classification, we idealized our human opponents so that they never give incorrect answers (Section 6). [sent-363, score-0.45]
94 First, we introduce a new setting for exploring the problem of incremental classification: trivia games. [sent-366, score-0.307]
95 We took advantage of that ease and created a framework for quickly and efficiently gathering examples of humans doing incremental classification. [sent-368, score-0.331]
96 The second contribution shows that humans’ incremental classification improves state-of-the-art rapacious classification algorithms. [sent-374, score-0.6]
97 The problem of answering quiz bowl questions is itself a challenging task that combines issues from language modeling, large data, coreference, and reinforcement learning. [sent-377, score-0.717]
98 While we do not address all of these problems, our third contribution is a system that learns a policy in an MDP for incremental classification even in very large state spaces; it can successfully compete with skilled human players. [sent-378, score-0.478]
99 We are also interested in adding richer models of opponents to the state space, which would let the system adaptively adjust its strategies as it learns more about the strengths and weaknesses of its opponent (Waugh et al. [sent-382, score-0.311]
100 Acknowledgments We thank the many players who played our online quiz bowl to provide our data (and hopefully had fun doing so) and Carlo Angiuli, Arnav Moudgil, and Jerry Vinokurov for providing access to quiz bowl questions. [sent-392, score-1.095]
wordName wordTfidf (topN-words)
[('buzz', 0.361), ('buzzes', 0.344), ('quiz', 0.24), ('bowl', 0.23), ('answer', 0.22), ('rapacious', 0.213), ('incremental', 0.209), ('content', 0.162), ('questions', 0.158), ('players', 0.155), ('question', 0.152), ('answers', 0.136), ('humans', 0.122), ('opponent', 0.115), ('jennings', 0.098), ('mdp', 0.098), ('trivia', 0.098), ('dir', 0.094), ('classification', 0.089), ('users', 0.089), ('revealed', 0.084), ('buzzed', 0.082), ('answered', 0.07), ('fermi', 0.066), ('opponents', 0.066), ('ravel', 0.066), ('bayes', 0.065), ('policy', 0.065), ('clues', 0.065), ('posterior', 0.062), ('compete', 0.06), ('action', 0.057), ('jeopardy', 0.056), ('state', 0.055), ('na', 0.054), ('ve', 0.054), ('reinforcement', 0.052), ('labels', 0.051), ('bryan', 0.05), ('abbeel', 0.049), ('buzzing', 0.049), ('enrico', 0.049), ('magnetic', 0.049), ('payoff', 0.049), ('tournaments', 0.049), ('guess', 0.047), ('team', 0.047), ('draw', 0.046), ('daume', 0.044), ('label', 0.044), ('silver', 0.042), ('classifier', 0.041), ('decision', 0.041), ('tokens', 0.041), ('wrong', 0.04), ('incorrect', 0.039), ('space', 0.039), ('chai', 0.038), ('maurice', 0.038), ('umd', 0.038), ('actions', 0.038), ('answering', 0.037), ('user', 0.036), ('strategies', 0.036), ('crowdsourcing', 0.035), ('policies', 0.035), ('teams', 0.035), ('incrementally', 0.034), ('levy', 0.033), ('zn', 0.033), ('signal', 0.033), ('apprenticeship', 0.033), ('bionlp', 0.033), ('blatz', 0.033), ('boddy', 0.033), ('buz', 0.033), ('cloud', 0.033), ('composer', 0.033), ('crossword', 0.033), ('cyan', 0.033), ('exhibition', 0.033), ('heights', 0.033), ('horsch', 0.033), ('lam', 0.033), ('loper', 0.033), ('lusitania', 0.033), ('maytal', 0.033), ('millionaire', 0.033), ('moldovan', 0.033), ('nltk', 0.033), ('nonzero', 0.033), ('protesting', 0.033), ('pujara', 0.033), ('rostamizadeh', 0.033), ('sinking', 0.033), ('starter', 0.033), ('syed', 0.033), ('tesauro', 0.033), ('thibadeau', 0.033), ('wait', 0.033)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 23 emnlp-2012-Besting the Quiz Master: Crowdsourcing Incremental Classification Games
Author: Jordan Boyd-Graber ; Brianna Satinoff ; He He ; Hal Daume III
Abstract: Cost-sensitive classification, where the features used in machine learning tasks have a cost, has been explored as a means of balancing knowledge against the expense of incrementally obtaining new features. We introduce a setting where humans engage in classification with incrementally revealed features: the collegiate trivia circuit. By providing the community with a web-based system to practice, we collected tens of thousands of implicit word-by-word ratings of how useful features are for eliciting correct answers. Observing humans’ classification process, we improve the performance of a state-of-the-art classifier. We also use the dataset to evaluate a system to compete in the incremental classification task through a reduction of reinforcement learning to classification. Our system learns when to answer a question, performing better than baselines and most human players.
2 0.18918496 137 emnlp-2012-Why Question Answering using Sentiment Analysis and Word Classes
Author: Jong-Hoon Oh ; Kentaro Torisawa ; Chikara Hashimoto ; Takuya Kawada ; Stijn De Saeger ; Jun'ichi Kazama ; Yiou Wang
Abstract: In this paper we explore the utility of sentiment analysis and semantic word classes for improving why-question answering on a large-scale web corpus. Our work is motivated by the observation that a why-question and its answer often follow the pattern that if something undesirable happens, the reason is also often something undesirable, and if something desirable happens, the reason is also often something desirable. To the best of our knowledge, this is the first work that introduces sentiment analysis to non-factoid question answering. We combine this simple idea with semantic word classes for ranking answers to why-questions and show that on a set of 850 why-questions our method gains 15.2% improvement in precision at the top-1 answer over a baseline state-of-the-art QA system that achieved the best performance in a shared task of Japanese non-factoid QA in NTCIR-6.
3 0.17807719 20 emnlp-2012-Answering Opinion Questions on Products by Exploiting Hierarchical Organization of Consumer Reviews
Author: Jianxing Yu ; Zheng-Jun Zha ; Tat-Seng Chua
Abstract: This paper proposes to generate appropriate answers for opinion questions about products by exploiting the hierarchical organization of consumer reviews. The hierarchy organizes product aspects as nodes following their parent-child relations. For each aspect, the reviews and corresponding opinions on this aspect are stored. We develop a new framework for opinion Questions Answering, which enables accurate question analysis and effective answer generation by making use of the hierarchy. In particular, we first identify the (explicit/implicit) product aspects asked in the questions and their sub-aspects by referring to the hierarchy. We then retrieve the corresponding review fragments relevant to the aspects from the hierarchy. In order to generate appropriate answers from the review fragments, we develop a multi-criteria optimization approach for answer generation by simultaneously taking into account review salience, coherence, diversity, and parent-child relations among the aspects. We conduct evaluations on 11 popular products in four domains. The evaluated corpus contains 70,359 consumer reviews and 220 questions on these products. Experimental results demonstrate the effectiveness of our approach.
4 0.11926298 102 emnlp-2012-Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems
Author: Nina Dethlefs ; Helen Hastie ; Verena Rieser ; Oliver Lemon
Abstract: Incremental processing allows system designers to address several discourse phenomena that have previously been somewhat neglected in interactive systems, such as backchannels or barge-ins, but that can enhance the responsiveness and naturalness of systems. Unfortunately, prior work has focused largely on deterministic incremental decision making, rendering system behaviour less flexible and adaptive than is desirable. We present a novel approach to incremental decision making that is based on Hierarchical Reinforcement Learning to achieve an interactive optimisation of Information Presentation (IP) strategies, allowing the system to generate and comprehend backchannels and barge-ins, by employing the recent psycholinguistic hypothesis of information density (ID) (Jaeger, 2010). Results in terms of average rewards and a human rating study show that our learnt strategy outperforms several baselines that are not sensitive to ID by more than 23%.
5 0.098735042 97 emnlp-2012-Natural Language Questions for the Web of Data
Author: Mohamed Yahya ; Klaus Berberich ; Shady Elbassuoni ; Maya Ramanath ; Volker Tresp ; Gerhard Weikum
Abstract: The Linked Data initiative comprises structured databases in the Semantic-Web data model RDF. Exploring this heterogeneous data by structured query languages is tedious and error-prone even for skilled users. To ease the task, this paper presents a methodology for translating natural language questions into structured SPARQL queries over linked-data sources. Our method is based on an integer linear program to solve several disambiguation tasks jointly: the segmentation of questions into phrases; the mapping of phrases to semantic entities, classes, and relations; and the construction of SPARQL triple patterns. Our solution harnesses the rich type system provided by knowledge bases in the web of linked data, to constrain our semantic-coherence objective function. We present experiments on both the question translation and the resulting query answering.
6 0.076012067 41 emnlp-2012-Entity based QA Retrieval
7 0.068425268 56 emnlp-2012-Framework of Automatic Text Summarization Using Reinforcement Learning
8 0.059460789 114 emnlp-2012-Revisiting the Predictability of Language: Response Completion in Social Media
9 0.057379451 86 emnlp-2012-Locally Training the Log-Linear Model for SMT
10 0.056325994 94 emnlp-2012-Multiple Aspect Summarization Using Integer Linear Programming
11 0.05297469 29 emnlp-2012-Concurrent Acquisition of Word Meaning and Lexical Categories
12 0.052344475 107 emnlp-2012-Polarity Inducing Latent Semantic Analysis
13 0.049706604 84 emnlp-2012-Linking Named Entities to Any Database
14 0.047831953 120 emnlp-2012-Streaming Analysis of Discourse Participants
15 0.047105368 93 emnlp-2012-Multi-instance Multi-label Learning for Relation Extraction
16 0.044481263 77 emnlp-2012-Learning Constraints for Consistent Timeline Extraction
17 0.044447105 12 emnlp-2012-A Transition-Based System for Joint Part-of-Speech Tagging and Labeled Non-Projective Dependency Parsing
18 0.043695591 47 emnlp-2012-Explore Person Specific Evidence in Web Person Name Disambiguation
19 0.043231886 24 emnlp-2012-Biased Representation Learning for Domain Adaptation
20 0.043108463 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
topicId topicWeight
[(0, 0.195), (1, 0.082), (2, 0.014), (3, 0.141), (4, 0.021), (5, -0.082), (6, -0.047), (7, -0.004), (8, -0.003), (9, 0.048), (10, 0.15), (11, -0.025), (12, -0.21), (13, -0.135), (14, 0.089), (15, -0.023), (16, 0.115), (17, 0.096), (18, 0.003), (19, -0.068), (20, 0.238), (21, 0.03), (22, -0.027), (23, 0.258), (24, 0.19), (25, -0.09), (26, 0.127), (27, 0.097), (28, 0.037), (29, 0.146), (30, 0.048), (31, 0.028), (32, 0.125), (33, 0.027), (34, -0.164), (35, -0.004), (36, 0.07), (37, 0.104), (38, 0.009), (39, 0.02), (40, -0.01), (41, 0.044), (42, 0.049), (43, -0.047), (44, -0.084), (45, -0.01), (46, 0.023), (47, 0.014), (48, -0.031), (49, -0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.96343338 23 emnlp-2012-Besting the Quiz Master: Crowdsourcing Incremental Classification Games
Author: Jordan Boyd-Graber ; Brianna Satinoff ; He He ; Hal Daume III
Abstract: Cost-sensitive classification, where the features used in machine learning tasks have a cost, has been explored as a means of balancing knowledge against the expense of incrementally obtaining new features. We introduce a setting where humans engage in classification with incrementally revealed features: the collegiate trivia circuit. By providing the community with a web-based system to practice, we collected tens of thousands of implicit word-by-word ratings of how useful features are for eliciting correct answers. Observing humans’ classification process, we improve the performance of a state-of-the-art classifier. We also use the dataset to evaluate a system to compete in the incremental classification task through a reduction of reinforcement learning to classification. Our system learns when to answer a question, performing better than baselines and most human players.
2 0.74153513 137 emnlp-2012-Why Question Answering using Sentiment Analysis and Word Classes
Author: Jong-Hoon Oh ; Kentaro Torisawa ; Chikara Hashimoto ; Takuya Kawada ; Stijn De Saeger ; Jun'ichi Kazama ; Yiou Wang
Abstract: In this paper we explore the utility of sentiment analysis and semantic word classes for improving why-question answering on a large-scale web corpus. Our work is motivated by the observation that a why-question and its answer often follow the pattern that if something undesirable happens, the reason is also often something undesirable, and if something desirable happens, the reason is also often something desirable. To the best of our knowledge, this is the first work that introduces sentiment analysis to non-factoid question answering. We combine this simple idea with semantic word classes for ranking answers to why-questions and show that on a set of 850 why-questions our method gains 15.2% improvement in precision at the top-1 answer over a baseline state-of-the-art QA system that achieved the best performance in a shared task of Japanese non-factoid QA in NTCIR-6.
3 0.68323278 20 emnlp-2012-Answering Opinion Questions on Products by Exploiting Hierarchical Organization of Consumer Reviews
Author: Jianxing Yu ; Zheng-Jun Zha ; Tat-Seng Chua
Abstract: This paper proposes to generate appropriate answers for opinion questions about products by exploiting the hierarchical organization of consumer reviews. The hierarchy organizes product aspects as nodes following their parent-child relations. For each aspect, the reviews and corresponding opinions on this aspect are stored. We develop a new framework for opinion Questions Answering, which enables accurate question analysis and effective answer generation by making use of the hierarchy. In particular, we first identify the (explicit/implicit) product aspects asked in the questions and their sub-aspects by referring to the hierarchy. We then retrieve the corresponding review fragments relevant to the aspects from the hierarchy. In order to generate appropriate answers from the review fragments, we develop a multi-criteria optimization approach for answer generation by simultaneously taking into account review salience, coherence, diversity, and parent-child relations among the aspects. We conduct evaluations on 11 popular products in four domains. The evaluated corpus contains 70,359 consumer reviews and 220 questions on these products. Experimental results demonstrate the effectiveness of our approach.
4 0.43908459 97 emnlp-2012-Natural Language Questions for the Web of Data
Author: Mohamed Yahya ; Klaus Berberich ; Shady Elbassuoni ; Maya Ramanath ; Volker Tresp ; Gerhard Weikum
Abstract: The Linked Data initiative comprises structured databases in the Semantic-Web data model RDF. Exploring this heterogeneous data by structured query languages is tedious and error-prone even for skilled users. To ease the task, this paper presents a methodology for translating natural language questions into structured SPARQL queries over linked-data sources. Our method is based on an integer linear program to solve several disambiguation tasks jointly: the segmentation of questions into phrases; the mapping of phrases to semantic entities, classes, and relations; and the construction of SPARQL triple patterns. Our solution harnesses the rich type system provided by knowledge bases in the web of linked data, to constrain our semantic-coherence objective function. We present experiments on both the question translation and the resulting query answering.
5 0.42960542 41 emnlp-2012-Entity based QA Retrieval
Author: Amit Singh
Abstract: Bridging the lexical gap between the user’s question and the question-answer pairs in the Q&A archives has been a major challenge for Q&A retrieval. State-of-the-art approaches address this issue by implicitly expanding the queries with additional words using statistical translation models. While useful, the effectiveness of these models is highly dependent on the availability of quality corpus in the absence of which they are troubled by noise issues. Moreover these models perform word based expansion in a context agnostic manner resulting in translation that might be mixed and fairly general. This results in degraded retrieval performance. In this work we address the above issues by extending the lexical word based translation model to incorporate semantic concepts (entities). We explore strategies to learn the translation probabilities between words and the concepts using the Q&A archives and a popular entity catalog. Experiments conducted on a large scale real data show that the proposed techniques are promising.
6 0.36387366 102 emnlp-2012-Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems
7 0.34329775 107 emnlp-2012-Polarity Inducing Latent Semantic Analysis
8 0.3020556 29 emnlp-2012-Concurrent Acquisition of Word Meaning and Lexical Categories
9 0.27430677 86 emnlp-2012-Locally Training the Log-Linear Model for SMT
10 0.24386898 79 emnlp-2012-Learning Syntactic Categories Using Paradigmatic Representations of Word Context
11 0.23777363 56 emnlp-2012-Framework of Automatic Text Summarization Using Reinforcement Learning
12 0.23767392 77 emnlp-2012-Learning Constraints for Consistent Timeline Extraction
13 0.22510286 114 emnlp-2012-Revisiting the Predictability of Language: Response Completion in Social Media
14 0.22149755 60 emnlp-2012-Generative Goal-Driven User Simulation for Dialog Management
15 0.21966128 94 emnlp-2012-Multiple Aspect Summarization Using Integer Linear Programming
16 0.21457389 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
17 0.20727439 120 emnlp-2012-Streaming Analysis of Discourse Participants
18 0.20677783 122 emnlp-2012-Syntactic Surprisal Affects Spoken Word Duration in Conversational Contexts
19 0.20296359 10 emnlp-2012-A Statistical Relational Learning Approach to Identifying Evidence Based Medicine Categories
20 0.20057067 115 emnlp-2012-SSHLDA: A Semi-Supervised Hierarchical Topic Model
topicId topicWeight
[(2, 0.033), (16, 0.049), (25, 0.014), (34, 0.069), (45, 0.016), (59, 0.26), (60, 0.109), (63, 0.076), (64, 0.021), (65, 0.025), (70, 0.017), (73, 0.035), (74, 0.039), (76, 0.053), (80, 0.015), (86, 0.055), (95, 0.021)]
simIndex simValue paperId paperTitle
1 0.8194328 51 emnlp-2012-Extracting Opinion Expressions with semi-Markov Conditional Random Fields
Author: Bishan Yang ; Claire Cardie
Abstract: Extracting opinion expressions from text is usually formulated as a token-level sequence labeling task tackled using Conditional Random Fields (CRFs). CRFs, however, do not readily model potentially useful segment-level information like syntactic constituent structure. Thus, we propose a semi-CRF-based approach to the task that can perform sequence labeling at the segment level. We extend the original semi-CRF model (Sarawagi and Cohen, 2004) to allow the modeling of arbitrarily long expressions while accounting for their likely syntactic structure when modeling segment boundaries. We evaluate performance on two opinion extraction tasks, and, in contrast to previous sequence labeling approaches to the task, explore the usefulness of segment- level syntactic parse features. Experimental results demonstrate that our approach outperforms state-of-the-art methods for both opinion expression tasks.
same-paper 2 0.7876513 23 emnlp-2012-Besting the Quiz Master: Crowdsourcing Incremental Classification Games
Author: Jordan Boyd-Graber ; Brianna Satinoff ; He He ; Hal Daume III
Abstract: Cost-sensitive classification, where the features used in machine learning tasks have a cost, has been explored as a means of balancing knowledge against the expense of incrementally obtaining new features. We introduce a setting where humans engage in classification with incrementally revealed features: the collegiate trivia circuit. By providing the community with a web-based system to practice, we collected tens of thousands of implicit word-by-word ratings of how useful features are for eliciting correct answers. Observing humans’ classification process, we improve the performance of a state-of-the-art classifier. We also use the dataset to evaluate a system to compete in the incremental classification task through a reduction of reinforcement learning to classification. Our system learns when to answer a question, performing better than baselines and most human players.
3 0.56256902 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents
Author: Heeyoung Lee ; Marta Recasens ; Angel Chang ; Mihai Surdeanu ; Dan Jurafsky
Abstract: We introduce a novel coreference resolution system that models entities and events jointly. Our iterative method cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. As clusters are built, information flows between entity and event clusters through features that model semantic role dependencies. Our system handles nominal and verbal events as well as entities, and our joint formulation allows information from event coreference to help entity coreference, and vice versa. In a cross-document domain with comparable documents, joint coreference resolution performs significantly better (over 3 CoNLL F1 points) than two strong baselines that resolve entities and events separately.
4 0.54901081 20 emnlp-2012-Answering Opinion Questions on Products by Exploiting Hierarchical Organization of Consumer Reviews
Author: Jianxing Yu ; Zheng-Jun Zha ; Tat-Seng Chua
Abstract: This paper proposes to generate appropriate answers for opinion questions about products by exploiting the hierarchical organization of consumer reviews. The hierarchy organizes product aspects as nodes following their parent-child relations. For each aspect, the reviews and corresponding opinions on this aspect are stored. We develop a new framework for opinion Questions Answering, which enables accurate question analysis and effective answer generation by making use of the hierarchy. In particular, we first identify the (explicit/implicit) product aspects asked in the questions and their sub-aspects by referring to the hierarchy. We then retrieve the corresponding review fragments relevant to the aspects from the hierarchy. In order to generate appropriate answers from the review fragments, we develop a multi-criteria optimization approach for answer generation by simultaneously taking into account review salience, coherence, diversity, and parent-child relations among the aspects. We conduct evaluations on 11 popular products in four domains. The evaluated corpus contains 70,359 consumer reviews and 220 questions on these products. Experimental results demonstrate the effectiveness of our approach.
5 0.54297793 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
Author: Valentin I. Spitkovsky ; Hiyan Alshawi ; Daniel Jurafsky
Abstract: We present a new family of models for unsupervised parsing, Dependency and Boundary models, that use cues at constituent boundaries to inform head-outward dependency tree generation. We build on three intuitions that are explicit in phrase-structure grammars but only implicit in standard dependency formulations: (i) Distributions of words that occur at sentence boundaries such as English determiners resemble constituent edges. (ii) Punctuation at sentence boundaries further helps distinguish full sentences from fragments like headlines and titles, allowing us to model grammatical differences between complete and incomplete sentences. (iii) Sentence-internal punctuation boundaries help with longer-distance dependencies, since punctuation correlates with constituent edges. Our models induce state-of-the-art dependency grammars for many languages without special knowledge of optimal input sentence lengths or biased, manually-tuned initializers.
6 0.53905636 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
7 0.53622109 14 emnlp-2012-A Weakly Supervised Model for Sentence-Level Semantic Orientation Analysis with Multiple Experts
8 0.53538483 82 emnlp-2012-Left-to-Right Tree-to-String Decoding with Prediction
9 0.53388977 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
10 0.53257191 123 emnlp-2012-Syntactic Transfer Using a Bilingual Lexicon
11 0.5310477 110 emnlp-2012-Reading The Web with Learned Syntactic-Semantic Inference Rules
12 0.5296343 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
13 0.52955502 92 emnlp-2012-Multi-Domain Learning: When Do Domains Matter?
14 0.5281989 18 emnlp-2012-An Empirical Investigation of Statistical Significance in NLP
15 0.52806103 114 emnlp-2012-Revisiting the Predictability of Language: Response Completion in Social Media
16 0.52771443 135 emnlp-2012-Using Discourse Information for Paraphrase Extraction
17 0.52738661 42 emnlp-2012-Entropy-based Pruning for Phrase-based Machine Translation
18 0.52613401 93 emnlp-2012-Multi-instance Multi-label Learning for Relation Extraction
19 0.52560818 12 emnlp-2012-A Transition-Based System for Joint Part-of-Speech Tagging and Labeled Non-Projective Dependency Parsing
20 0.52540797 107 emnlp-2012-Polarity Inducing Latent Semantic Analysis