emnlp emnlp2011 emnlp2011-106 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Dani Yogatama ; Michael Heilman ; Brendan O'Connor ; Chris Dyer ; Bryan R. Routledge ; Noah A. Smith
Abstract: We consider the problem of predicting measurable responses to scientific articles based primarily on their text content. Specifically, we consider papers in two fields (economics and computational linguistics) and make predictions about downloads and within-community citations. Our approach is based on generalized linear models, allowing interpretability; a novel extension that captures first-order temporal effects is also presented. We demonstrate that text features significantly improve accuracy of predictions over metadata features like authors, topical categories, and publication venues.
Reference: text
sentIndex sentText sentNum sentScore
1 Specifically, we consider papers in two fields (economics and computational linguistics) and make predictions about downloads and within-community citations. [sent-6, score-0.666]
2 There are many measures of impact of a scientific paper; ours come from direct measurements of the number of downloads (from an established website where prominent economists post papers before formal publication) and citations (within a fixed scientific community). [sent-22, score-1.039]
3 We adopt a discriminative approach based on generalized linear models that can make use of any text or metadata features, and show that simple lexical features offer substantial power in modeling out-of-sample response and in forecasting response for future articles. [sent-23, score-0.602]
4 Our time series-inspired regularizer is computationally efficient in learning and is a significant advance over earlier text-driven forecasting models that ignore the time variable altogether (Kogan et al. [sent-27, score-0.474]
5 We evaluate our approaches in two novel experimental settings: predicting downloads of economics articles and predicting citation of papers at ACL conferences. [sent-30, score-1.04]
6 Figure 1: Left: the distribution of log download counts for papers in the NBER dataset one year after posting. [sent-35, score-0.679]
7 Right: the distribution of within-dataset citations of ACL papers within three years of publication (outliers excluded for readability). [sent-36, score-0.601]
8 1 NBER Our first dataset consists of research papers in economics from the National Bureau of Economic Research (NBER) from 1999 to 2009 (http://www.nber.org). [sent-42, score-0.387]
9 The papers are not yet peer-reviewed, but given the prominence of many economists affiliated with the NBER, many of the papers are widely read. [sent-47, score-0.623]
10 Text from the abstracts of the papers and related metadata are publicly available. [sent-48, score-0.385]
11 For each paper, we computed the total number of downloads in the first year after each paper's posting. The download counts are log-normally distributed, as shown in Figure 1, and so our regression models (§3) minimize squared errors in the log space. [sent-51, score-0.956]
12 For the vast majority of papers, most of the downloads occur soon after the paper's posting. [sent-53, score-0.354]
13 We leave a more detailed analysis of the time series patterns of downloads to future work. [sent-55, score-0.6]
14 We use the 8,814 papers from the 1999–2009 period (there are 16,334 papers in the full dataset dating back to 1985). [sent-61, score-0.649]
15 2 ACL Our second dataset consists of research papers from the Association for Computational Linguistics (ACL) from 1980 to 2006 (Radev et al. [sent-64, score-0.323]
16 We have the full texts for papers (OCR output) as well as structured citation data. [sent-67, score-0.544]
17 For the citation prediction task, we include conference papers from ACL, EACL, HLT, and NAACL. [sent-69, score-0.543]
18 We do include short papers, interactive demo session papers, and student research papers that are included in the companion volumes for these conferences (such papers are cited less than full papers, but many are still cited). [sent-71, score-0.727]
19 The number of papers in each year varies because not all conferences are annual. [sent-73, score-0.522]
20 We look at citations in the three-year window following publication, excluding self-citations and only considering citations from papers within these conferences. [sent-74, score-0.687]
21 Figure 1 shows a histogram; note that many papers (54%) are not cited at all, and the distribution of citations per paper is neither normal nor log-normal. [sent-75, score-0.598]
22 We organize the papers into two classes: those with zero citations and those with non-zero citations in the three-year window. [sent-76, score-0.687]
23 3 Model Our forecasting approach is based on generalized linear models for regression and classification. [sent-78, score-0.502]
24 For the NBER data, where the (log) number of downloads is nearly a continuous measure, we use linear regression. [sent-80, score-0.354]
25 Then, we describe a regularization scheme appropriate for time series data. [sent-84, score-0.492]
26 For the NBER data, the (log) number of downloads is continuous, and so we use a least-squares linear regression model. [sent-93, score-0.542]
27 Ridge regression (Hoerl and Kennard, 1970) is a standard method to which we compare the time series regularization discussed in §3. [sent-120, score-0.787]
28 The ridge linear regression model can be interpreted probabilistically as follows: each coefficient βj is drawn i.i.d. from a zero-mean Gaussian prior. [sent-124, score-0.506]
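As a concrete illustration of this setup, here is a minimal NumPy sketch of ridge regression on log responses. The closed-form solution, the unpenalized-bias convention, and the toy data are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: beta = (X'X + lam*I)^(-1) X'y.

    A standard textbook sketch, not the paper's code. Column 0 of X is
    assumed to be the bias feature and is left unpenalized, mirroring
    the footnote about the bias weight.
    """
    d = X.shape[1]
    penalty = lam * np.eye(d)
    penalty[0, 0] = 0.0  # leave the bias weight unpenalized
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

# Toy usage: regress log(1 + downloads) on illustrative features.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((100, 1)), rng.standard_normal((100, 5))])
y = np.log1p(rng.poisson(50, size=100))
beta = ridge_fit(X, y, lam=1.0)
```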
29 In this work, we apply time series regularization to GLMs, enabling models whose coefficients change over time while preferring gradual changes across time steps. [sent-140, score-0.564]
30 The time series regularization penalty becomes: $R(\beta) = \lambda \sum_{t=1}^{T} \sum_{j=1}^{d} \beta_{t,j}^2 + \lambda\alpha \sum_{t=2}^{T} \sum_{j=0}^{d} (\beta_{t,j} - \beta_{t-1,j})^2$. It includes a standard ℓ2-penalty on the coefficients, and a penalty for differences between coefficients for adjacent time steps to induce smooth changes. [sent-144, score-0.579]
31 Setting α to zero imposes no penalty for time-variation in the coefficients and results in independent ridge regressions at each time step. [sent-147, score-0.446]
32 Also, when the number of examples is constant across time steps, setting a large α parameter (α → ∞) results in a single ridge regression over all years, since it imposes βt,j = βt+1,j for all t. [sent-148, score-0.551]
33 The partial derivative is: ∂R/∂βt,j = 2λβt,j + 1{t > 1}2λα(βt,j − βt−1,j) + 1{t < T}2λα(βt,j − βt+1,j) This time series regularization can be applied more generally, not just to linear and logistic regression. [sent-149, score-0.397]
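As a concrete illustration, here is a minimal NumPy sketch of this penalty and its gradient for a coefficient matrix β of shape (T, d+1). The array layout and function names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def ts_penalty(beta, lam, alpha):
    """R(beta) for beta of shape (T, d+1), where column 0 is the bias.

    Sketch of the stated penalty: an l2 term on the non-bias weights at
    every time step, plus a smoothness term on adjacent time steps
    (including the bias, per the j=0 lower bound in the second sum).
    """
    l2 = lam * np.sum(beta[:, 1:] ** 2)
    diffs = beta[1:] - beta[:-1]
    smooth = lam * alpha * np.sum(diffs ** 2)
    return l2 + smooth

def ts_penalty_grad(beta, lam, alpha):
    """Gradient matching the partial derivative given above."""
    grad = 2.0 * lam * beta
    grad[:, 0] = 0.0  # bias magnitude is unpenalized (footnote 5)
    grad[1:] += 2.0 * lam * alpha * (beta[1:] - beta[:-1])   # 1{t > 1} term
    grad[:-1] += 2.0 * lam * alpha * (beta[:-1] - beta[1:])  # 1{t < T} term
    return grad
```

Because the penalty is quadratic, adding this gradient to any convex loss gradient keeps the overall objective amenable to standard gradient-based optimizers.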
34 With either ridge regularization or this time series regularization scheme, Eq. [sent-150, score-0.711]
35 1 is an unconstrained convex optimization problem for the linear models. (Our implementation of the time series regularizer does not penalize the magnitude of the weight for the bias feature, as in ridge regression.) [sent-151, score-0.588]
36 Figure 2: Time series regression as a graphical model; the variables Xt and Yt are the sets of feature vectors and response variables from documents dated t. [sent-154, score-0.452]
37 Probabilistic Interpretation We can interpret the time series regularization probabilistically as follows. [sent-157, score-0.384]
38 Figure 2 shows a graphical representation of the time series regularization in our model. [sent-171, score-0.358]
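One way to spell out this interpretation, as a sketch of the standard correspondence for quadratic penalties rather than necessarily the authors' exact formulation: minimizing loss plus R(β) is MAP estimation under the prior

$$p(\beta) \;\propto\; \exp\!\Big(-\lambda \sum_{t=1}^{T}\sum_{j=1}^{d} \beta_{t,j}^{2}\Big)\, \exp\!\Big(-\lambda\alpha \sum_{t=2}^{T}\sum_{j=0}^{d} (\beta_{t,j}-\beta_{t-1,j})^{2}\Big),$$

whose second factor is a first-order Gaussian random walk: each β_{t,j} is centered at β_{t-1,j} with variance 1/(2λα), which is exactly the preference for gradual coefficient changes described above.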
39 Almost all NBER papers are tagged with one or more programs (we assign untagged papers a "null" tag). [sent-206, score-0.581]
40 5 Experiments For each of the datasets in §2, we test our models for two tasks: forecasting about future papers (i.e. [sent-212, score-0.59]
41 , making predictions about papers that appeared after a training dataset) and modeling held-out papers from the past (i.e. [sent-214, score-0.659]
42 For the NBER dataset, the task is to predict the number of downloads a paper will receive in its first year after publication. [sent-217, score-0.569]
43 To our knowledge, clean, reliable citation counts are not available for the NBER dataset; nor are download statistics available for the ACL dataset. [sent-219, score-0.354]
44 1 Extrapolation The lag between a paper’s publication and when its outcome (download or citation count) can be observed poses a unique methodological challenge. [sent-222, score-0.293]
45 Consider predicting the number of downloads over g future time steps. [sent-223, score-0.461]
46 To extrapolate its number of downloads, we consider the observed number in [t0, t], and then estimate the ratio r of downloads that occur in the first t − t0 time steps, against the first g time steps, using the fully observed portion of the training data. [sent-229, score-0.487]
47 We then scale the observed downloads during [t0, t] by 1/r to extrapolate. [sent-230, score-0.354]
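As a concrete sketch of this ratio-based extrapolation (the function signature and array layout are illustrative assumptions, not the paper's code):

```python
import numpy as np

def extrapolate_downloads(observed, t0, t, g, train_curves):
    """Extrapolate a paper's g-step download count from a partial window.

    Assumed inputs:
      observed     -- downloads seen for the new paper in [t0, t]
      train_curves -- array (n_papers, g) of per-step download counts for
                      fully observed training papers
    r is the fraction of g-step downloads that training papers accrue in
    their first (t - t0) steps; scaling by 1/r extrapolates the total.
    """
    steps_seen = t - t0
    first_part = train_curves[:, :steps_seen].sum()
    full_part = train_curves.sum()
    r = first_part / full_part
    return observed / r
```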
48 In preliminary experiments, we observed that extrapolating responses for papers in the forecast gap led to better performance in general. [sent-232, score-0.478]
49 For example, for the ridge regressions trained on all past years with the full feature set, the error dropped from 262 to 259 when using extrapolation compared to without extrapolation. [sent-233, score-0.494]
50 2 Forecasting NBER Downloads In our first set of experiments, we predict the number of downloads of an NBER paper within one year of its publication. [sent-236, score-0.569]
51 The second and third use GLMs with ridge regression-style regularization (§3. [sent-239, score-0.353]
52 2), trained on all past years ("all years") and on the single most recent past year ("one year"), respectively. [sent-240, score-0.359]
53 The last model (“time series”) is a GLM with time series regularization (§3. [sent-241, score-0.358]
54 We held out a random 20% of papers for each year from 1999–2007 as a test set for the task of modeling the past. [sent-245, score-0.524]
55 To define the feature set and tune hyperparameters, we used the remaining 80% of papers from 1999–2005 as our training data and the remaining papers in 2006 as our development data. [sent-246, score-0.55]
56 When tuning hyperparameters, we simulated the existence of a forecast gap by using extrapolated responses for papers in the last year of the training data instead of their true responses. [sent-248, score-0.769]
57 We then used the selected feature set and hyperparameters to test the forecasting and modeling capabilities of each model. [sent-256, score-0.299]
58 For forecasting, we predicted numbers of downloads of papers in 2008 and 2009. [sent-257, score-0.689]
59 We used the baseline median, ridge regression, and time series regularization models trained on papers in 1999–2007 and 1999–2008, respectively. [sent-258, score-0.9]
60 We treated the last year of each training set (2007 and 2008, respectively) as a forecast gap, since we would not have observed complete responses of papers in these years when forecasting. [sent-289, score-0.508]
61 Papers from the most recent past year in a training set have incomplete responses, so the models were trained on extrapolated responses for that year. [sent-291, score-0.356]
62 For the NBER development set from 2005, a ridge regression on just 2004 papers (for which extrapolation is needed) outperformed a regression on just 2003 (for which extrapolation is not needed), 278 to 367 mean absolute error. [sent-292, score-1.044]
63 "†" indicates Wilcoxon signed-rank significance between feature sets and between models using the full feature set (p < 0.05). [sent-299, score-0.545]
64 To evaluate the modeling capabilities, we trained the ridge regression and time series regularization models on papers from 1999–2008 and predicted the numbers of downloads of held-out papers in 1999–2007. [sent-302, score-1.811]
65 For comparison, we also trained ridge regression models on each individual year ("one year") and predicted the numbers of downloads of the held-out papers in the corresponding year. [sent-303, score-1.359]
66 Table 3 shows mean absolute errors for each method on both forecasting test splits, and mean absolute errors averaged across papers over nine modeling test splits. [sent-304, score-0.574]
67 While the time series model did not significantly outperform ridge regression at predicting future downloads, it did result in significantly better performance for modeling papers in the past. [sent-308, score-1.037]
68 3 Forecasting ACL Citations We now turn to the problem of predicting citation levels. [sent-310, score-0.294]
69 Our experimental setup (Figure 3) is similar to the setup for the NBER dataset, except that we use logistic regression to model the discrete cited-or-not response variable. [sent-312, score-0.299]
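A minimal sklearn sketch of this cited-or-not setup; the feature matrix and class balance here are toy assumptions (Figure 1 reports that about 54% of papers are uncited), not the paper's features or code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))        # stand-in for text/metadata features
y = (rng.random(200) > 0.54).astype(int)  # 1 = cited within three years

# C is the inverse regularization strength in sklearn's parameterization.
clf = LogisticRegression(C=1.0, max_iter=1000).fit(X, y)
print("training accuracy:", clf.score(X, y))
```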
70 Table 4: Classification accuracy (%) for predicting whether ACL papers will be cited within three years (modeling 1980–2003; forecasting 2004, 2005, 2006). [sent-315, score-0.392]
71 With the full feature set, differences between the time series and ridge (all years) models are not statistically significant at the 0.05 level. [sent-319, score-0.541]
72 Again, we compare four methods: a baseline of always predicting the most frequent class in the training data, "all years" and "one year" logistic regression models, and a logistic regression with the time series regularizer. [sent-326, score-0.753]
73 For the forecasting task, we used papers in 2004, 2005, and 2006 as test sets. [sent-327, score-0.54]
74 As the training sets for the "all years" and time series models, we used papers from 1980 up to the last year before each test set, with the last two years extrapolated. [sent-328, score-0.804]
75 As the training sets for the "one year" models, we used papers from the year immediately before the test set, with extrapolated responses. [sent-329, score-0.291]
76 To evaluate modeling capabilities, we predicted citation levels of held-out papers in 1980–2003. [sent-330, score-0.576]
77 We trained "one year" models separately for each year and predicted citation levels for the held-out papers in that year. [sent-332, score-0.896]
78 Table 4 shows classification accuracy for each model on the test data for both the forecasting and modeling tasks. [sent-333, score-0.299]
79 Also, the time series regression model shows a small, though not statistically significant, gain for modeling whether past papers will be cited—as well as similarly small gains on two of the three forecasting test years. [sent-335, score-1.046]
80 4 Ranking We can also use the models for ranking to help decide which papers are expected to have the greatest impact. [sent-337, score-0.301]
81 With rankings, we can use the same metric both for download and citation predictions. [sent-338, score-0.354]
82 For the NBER data, we ranked test-set papers based on the predicted numbers of downloads and computed the correlation to the actual numbers of downloads. [sent-339, score-0.723]
83 For the ACL data, we ranked papers based on the probability of being cited (within the next three years) and computed the correlation to the actual numbers of citations. [sent-340, score-0.426]
84 Here, the items are scientific papers and the two metrics are the gold standard numbers of downloads (or citations) and model predictions for the numbers of downloads, or citation probabilities. [sent-342, score-1.055]
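A short sketch of this ranking evaluation using Kendall's τ, which the analysis below also references; the variable names and toy values are illustrative assumptions, not the paper's data.

```python
import numpy as np
from scipy.stats import kendalltau

# Rank held-out papers by predicted response, then correlate the ranking
# with the gold responses (citation counts or download counts).
gold = np.array([12, 0, 3, 45, 7, 1])            # e.g., true citation counts
pred = np.array([0.8, 0.1, 0.3, 0.9, 0.5, 0.2])  # e.g., predicted P(cited)

tau, p_value = kendalltau(pred, gold)
print(f"Kendall's tau = {tau:.3f} (p = {p_value:.3f})")
```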
85 As in the previous experiments, we see small benefits for the time series regression model on most held-out data splits, and larger benefits for including text features along with metadata features. [sent-350, score-0.544]
86 6 Analysis An advantage of the time series regularized regression model is its interpretability. [sent-351, score-0.434]
87 Here, we use models of responses to individual papers for ranking (i.e. [sent-353, score-0.378]
88 Time series regularization could also be applied to ranking models that model pairwise preferences to optimize metrics like Kendall’s τ directly, as discussed by Joachims (2002). [sent-356, score-0.33]
89 The year-to-year weights of "one year" models fluctuate substantially, and the "all years" model is necessarily constant, but the time series regularizer gives a smooth trajectory. [sent-368, score-0.347]
90 Figure 5 illustrates the βt,j trends in the ACL time series model for some selected terms. [sent-376, score-0.3]
91 The effect is present but relatively small according to our model: the total number of papers co-authored by an author has a weak correlation to the author's citation prediction coefficient (τ = 0. [sent-399, score-0.676]
92 Since we did not prune author features, there are many authors with few papers. More precisely: if a prolific author and a non-prolific author write a paper, does the prolific author's paper have a higher probability of being cited than the non-prolific author's, all other things being equal? [sent-403, score-0.477]
93 Every point is one ACL author, and the vertical axis shows the citation coefficient, compared to (a) the number of documents co-authored by the author; and (b) the proportion of an author’s papers that are cited within three years. [sent-417, score-0.657]
94 The semantics of the regression imply we are measuring the relative citation probability of an author, controlling for text and venue effects. [sent-424, score-0.454]
95 If an author has a high citation prediction coefficient but a low citation probability, that implies the author has better-cited work than would be expected according to the n-grams in his or her papers. [sent-425, score-0.724]
96 7 Related Work Previous work on modeling scientific literature mostly focused on citation graphs (Borner et al. [sent-428, score-0.355]
97 While the forecasts in those papers are similar to ours, those authors did not consider a forecast gap or allow the parameters of the model to vary over time. [sent-444, score-0.401]
98 Our time series regularization is closely related to the fused lasso (Tibshirani et al. [sent-445, score-0.383]
99 To improve the interpretability of the linear model, we developed a novel time series regularizer that encourages gradual changes across time steps. [sent-451, score-0.409]
100 Our experiments showed that text features significantly improve accuracy of predictions over baseline models, and we found that the feature weights learned with the time series regularizer reflect important trends in the literature. [sent-452, score-0.412]
wordName wordTfidf (topN-words)
[('nber', 0.413), ('downloads', 0.354), ('papers', 0.275), ('forecasting', 0.265), ('ridge', 0.241), ('citation', 0.241), ('year', 0.215), ('citations', 0.206), ('series', 0.192), ('regression', 0.188), ('cited', 0.117), ('download', 0.113), ('regularization', 0.112), ('metadata', 0.11), ('forecast', 0.088), ('author', 0.082), ('scientific', 0.08), ('responses', 0.077), ('extrapolation', 0.076), ('extrapolated', 0.076), ('regularizer', 0.075), ('coefficients', 0.072), ('response', 0.072), ('years', 0.068), ('economics', 0.064), ('scholarly', 0.063), ('meta', 0.059), ('unused', 0.059), ('trends', 0.054), ('time', 0.054), ('radev', 0.054), ('predicting', 0.053), ('publication', 0.052), ('coefficient', 0.051), ('authors', 0.05), ('dataset', 0.048), ('borner', 0.044), ('economists', 0.044), ('hoerl', 0.044), ('kogan', 0.044), ('regressions', 0.043), ('logistic', 0.039), ('mccullagh', 0.038), ('qazvinian', 0.038), ('past', 0.038), ('gap', 0.038), ('predictions', 0.037), ('penalty', 0.036), ('xt', 0.035), ('interpretability', 0.034), ('universities', 0.034), ('acl', 0.034), ('modeling', 0.034), ('numbers', 0.034), ('blei', 0.033), ('squared', 0.032), ('prolific', 0.032), ('conferences', 0.032), ('programs', 0.031), ('gerrish', 0.03), ('affiliated', 0.029), ('anthology', 0.029), ('bethard', 0.029), ('brank', 0.029), ('bureau', 0.029), ('erosheva', 0.029), ('feuelnl', 0.029), ('gaptest', 0.029), ('glms', 0.029), ('kennard', 0.029), ('mcgovern', 0.029), ('mtraoindienligng', 0.029), ('newp', 0.029), ('pjd', 0.029), ('revenues', 0.029), ('tehtew', 0.029), ('testgaptest', 0.029), ('pittsburgh', 0.029), ('tibshirani', 0.028), ('median', 0.028), ('ramage', 0.028), ('frequencies', 0.028), ('full', 0.028), ('log', 0.028), ('prediction', 0.027), ('probabilistically', 0.026), ('models', 0.026), ('predicted', 0.026), ('fused', 0.025), ('cameron', 0.025), ('venue', 0.025), ('extrapolate', 0.025), ('loss', 0.024), ('vertical', 0.024), ('joshi', 0.024), ('tw', 0.024), ('steps', 0.023), ('generalized', 0.023), ('published', 0.023), ('period', 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000005 106 emnlp-2011-Predicting a Scientific Communitys Response to an Article
Author: Dani Yogatama ; Michael Heilman ; Brendan O'Connor ; Chris Dyer ; Bryan R. Routledge ; Noah A. Smith
Abstract: We consider the problem of predicting measurable responses to scientific articles based primarily on their text content. Specifically, we consider papers in two fields (economics and computational linguistics) and make predictions about downloads and within-community citations. Our approach is based on generalized linear models, allowing interpretability; a novel extension that captures first-order temporal effects is also presented. We demonstrate that text features significantly improve accuracy of predictions over metadata features like authors, topical categories, and publication venues.
2 0.086997129 38 emnlp-2011-Data-Driven Response Generation in Social Media
Author: Alan Ritter ; Colin Cherry ; William B. Dolan
Abstract: We present a data-driven approach to generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation. We use a large corpus of status-response pairs found on Twitter to create a system that responds to Twitter status posts. We find that mapping conversational stimuli onto responses is more difficult than translating between languages, due to the wider range of possible responses, the larger fraction of unaligned words/phrases, and the presence of large phrase pairs whose alignment cannot be further decomposed. After addressing these challenges, we compare approaches based on SMT and Information Retrieval in a human evaluation. We show that SMT outperforms IR on this task, and its output is preferred over actual human responses in 15% of cases. As far as we are aware, this is the first work to investigate the use of phrase-based SMT to directly translate a linguistic stimulus into an appropriate response.
3 0.081720822 129 emnlp-2011-Structured Sparsity in Structured Prediction
Author: Andre Martins ; Noah Smith ; Mario Figueiredo ; Pedro Aguiar
Abstract: Linear models have enjoyed great success in structured prediction in NLP. While a lot of progress has been made on efficient training with several loss functions, the problem of endowing learners with a mechanism for feature selection is still unsolved. Common approaches employ ad hoc filtering or L1-regularization; both ignore the structure of the feature space, preventing practitioners from encoding structural prior knowledge. We fill this gap by adopting regularizers that promote structured sparsity, along with efficient algorithms to handle them. Experiments on three tasks (chunking, entity recognition, and dependency parsing) show gains in performance, compactness, and model interpretability.
4 0.055178758 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models
Author: Deepak Agarwal ; Bee-Chung Chen ; Bo Pang
Abstract: In recent years, the amount of user-generated opinionated texts (e.g., reviews, user comments) continues to grow at a rapid speed: featured news stories on a major event easily attract thousands of user comments on a popular online News service. How to consume subjective information of this volume becomes an interesting and important research question. In contrast to previous work on review analysis that tried to filter or summarize information for a generic average user, we explore a different direction of enabling personalized recommendation of such information. For each user, our task is to rank the comments associated with a given article according to personalized user preference (i.e., whether the user is likely to like or dislike the comment). To this end, we propose a factor model that incorporates rater-comment and rater-author interactions simultaneously in a principled way. Our full model significantly outperforms strong baselines as well as related models that have been considered in previous work.
5 0.043301456 21 emnlp-2011-Bayesian Checking for Topic Models
Author: David Mimno ; David Blei
Abstract: Real document collections do not fit the independence assumptions asserted by most statistical topic models, but how badly do they violate them? We present a Bayesian method for measuring how well a topic model fits a corpus. Our approach is based on posterior predictive checking, a method for diagnosing Bayesian models in user-defined ways. Our method can identify where a topic model fits the data, where it falls short, and in which directions it might be improved.
6 0.040870763 12 emnlp-2011-A Weakly-supervised Approach to Argumentative Zoning of Scientific Documents
7 0.036869854 101 emnlp-2011-Optimizing Semantic Coherence in Topic Models
8 0.035065673 41 emnlp-2011-Discriminating Gender on Twitter
9 0.035026003 119 emnlp-2011-Semantic Topic Models: Combining Word Distributional Statistics and Dictionary Definitions
10 0.034449525 100 emnlp-2011-Optimal Search for Minimum Error Rate Training
11 0.034224425 135 emnlp-2011-Timeline Generation through Evolutionary Trans-Temporal Summarization
12 0.033911642 125 emnlp-2011-Statistical Machine Translation with Local Language Models
13 0.033587813 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives
14 0.033363119 30 emnlp-2011-Compositional Matrix-Space Models for Sentiment Analysis
15 0.033314515 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
16 0.03297852 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing
17 0.030322105 146 emnlp-2011-Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance
18 0.030238161 110 emnlp-2011-Ranking Human and Machine Summarization Systems
19 0.029662196 120 emnlp-2011-Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
20 0.029583901 93 emnlp-2011-Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation
topicId topicWeight
[(0, 0.119), (1, -0.047), (2, -0.001), (3, -0.045), (4, 0.011), (5, 0.043), (6, -0.012), (7, -0.052), (8, -0.04), (9, -0.017), (10, -0.041), (11, -0.11), (12, -0.038), (13, -0.019), (14, -0.018), (15, 0.023), (16, 0.012), (17, -0.005), (18, -0.03), (19, -0.009), (20, -0.015), (21, 0.009), (22, 0.016), (23, 0.021), (24, 0.134), (25, 0.156), (26, -0.042), (27, 0.054), (28, -0.002), (29, 0.035), (30, 0.07), (31, 0.054), (32, -0.048), (33, -0.129), (34, -0.009), (35, 0.045), (36, -0.197), (37, 0.096), (38, 0.047), (39, 0.077), (40, 0.027), (41, -0.165), (42, 0.24), (43, -0.278), (44, 0.028), (45, -0.073), (46, -0.156), (47, -0.036), (48, -0.14), (49, 0.246)]
simIndex simValue paperId paperTitle
same-paper 1 0.9565686 106 emnlp-2011-Predicting a Scientific Communitys Response to an Article
Author: Dani Yogatama ; Michael Heilman ; Brendan O'Connor ; Chris Dyer ; Bryan R. Routledge ; Noah A. Smith
Abstract: We consider the problem of predicting measurable responses to scientific articles based primarily on their text content. Specifically, we consider papers in two fields (economics and computational linguistics) and make predictions about downloads and within-community citations. Our approach is based on generalized linear models, allowing interpretability; a novel extension that captures first-order temporal effects is also presented. We demonstrate that text features significantly improve accuracy of predictions over metadata features like authors, topical categories, and publication venues.
2 0.54349929 129 emnlp-2011-Structured Sparsity in Structured Prediction
Author: Andre Martins ; Noah Smith ; Mario Figueiredo ; Pedro Aguiar
Abstract: Linear models have enjoyed great success in structured prediction in NLP. While a lot of progress has been made on efficient training with several loss functions, the problem of endowing learners with a mechanism for feature selection is still unsolved. Common approaches employ ad hoc filtering or L1-regularization; both ignore the structure of the feature space, preventing practitioners from encoding structural prior knowledge. We fill this gap by adopting regularizers that promote structured sparsity, along with efficient algorithms to handle them. Experiments on three tasks (chunking, entity recognition, and dependency parsing) show gains in performance, compactness, and model interpretability.
3 0.43059731 91 emnlp-2011-Literal and Metaphorical Sense Identification through Concrete and Abstract Context
Author: Peter Turney ; Yair Neuman ; Dan Assaf ; Yohai Cohen
Abstract: Metaphor is ubiquitous in text, even in highly technical text. Correct inference about textual entailment requires computers to distinguish the literal and metaphorical senses of a word. Past work has treated this problem as a classical word sense disambiguation task. In this paper, we take a new approach, based on research in cognitive linguistics that views metaphor as a method for transferring knowledge from a familiar, well-understood, or concrete domain to an unfamiliar, less understood, or more abstract domain. This view leads to the hypothesis that metaphorical word usage is correlated with the degree of abstractness of the word's context. We introduce an algorithm that uses this hypothesis to classify a word sense in a given context as either literal (denotative) or metaphorical (connotative). We evaluate this algorithm with a set of adjective-noun phrases (e.g., in dark comedy, the adjective dark is used metaphorically; in dark hair, it is used literally) and with the TroFi (Trope Finder) Example Base of literal and nonliteral usage for fifty verbs. We achieve state-of-the-art performance on both datasets.
4 0.39844748 38 emnlp-2011-Data-Driven Response Generation in Social Media
Author: Alan Ritter ; Colin Cherry ; William B. Dolan
Abstract: We present a data-driven approach to generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation. We use a large corpus of status-response pairs found on Twitter to create a system that responds to Twitter status posts. We find that mapping conversational stimuli onto responses is more difficult than translating between languages, due to the wider range of possible responses, the larger fraction of unaligned words/phrases, and the presence of large phrase pairs whose alignment cannot be further decomposed. After addressing these challenges, we compare approaches based on SMT and Information Retrieval in a human evaluation. We show that SMT outperforms IR on this task, and its output is preferred over actual human responses in 15% of cases. As far as we are aware, this is the first work to investigate the use of phrase-based SMT to directly translate a linguistic stimulus into an appropriate response.
5 0.30943817 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models
Author: Deepak Agarwal ; Bee-Chung Chen ; Bo Pang
Abstract: In recent years, the amount of user-generated opinionated texts (e.g., reviews, user comments) continues to grow at a rapid speed: featured news stories on a major event easily attract thousands of user comments on a popular online News service. How to consume subjective information of this volume becomes an interesting and important research question. In contrast to previous work on review analysis that tried to filter or summarize information for a generic average user, we explore a different direction of enabling personalized recommendation of such information. For each user, our task is to rank the comments associated with a given article according to personalized user preference (i.e., whether the user is likely to like or dislike the comment). To this end, we propose a factor model that incorporates rater-comment and rater-author interactions simultaneously in a principled way. Our full model significantly outperforms strong baselines as well as related models that have been considered in previous work.
6 0.30151162 46 emnlp-2011-Efficient Subsampling for Training Complex Language Models
7 0.28204647 19 emnlp-2011-Approximate Scalable Bounded Space Sketch for Large Data NLP
8 0.27143177 110 emnlp-2011-Ranking Human and Machine Summarization Systems
9 0.26977879 12 emnlp-2011-A Weakly-supervised Approach to Argumentative Zoning of Scientific Documents
10 0.2571708 96 emnlp-2011-Multilayer Sequence Labeling
11 0.23500332 84 emnlp-2011-Learning the Information Status of Noun Phrases in Spoken Dialogues
12 0.22626744 48 emnlp-2011-Enhancing Chinese Word Segmentation Using Unlabeled Data
13 0.20715296 86 emnlp-2011-Lexical Co-occurrence, Statistical Significance, and Word Association
14 0.20602798 11 emnlp-2011-A Simple Word Trigger Method for Social Tag Suggestion
15 0.20507374 21 emnlp-2011-Bayesian Checking for Topic Models
16 0.1915355 100 emnlp-2011-Optimal Search for Minimum Error Rate Training
17 0.18614022 79 emnlp-2011-Lateen EM: Unsupervised Training with Multiple Objectives, Applied to Dependency Grammar Induction
18 0.18611787 82 emnlp-2011-Learning Local Content Shift Detectors from Document-level Information
19 0.17373493 143 emnlp-2011-Unsupervised Information Extraction with Distributional Prior Knowledge
20 0.17026681 2 emnlp-2011-A Cascaded Classification Approach to Semantic Head Recognition
topicId topicWeight
[(15, 0.01), (16, 0.352), (23, 0.097), (36, 0.017), (37, 0.026), (45, 0.09), (53, 0.02), (54, 0.024), (57, 0.018), (62, 0.024), (64, 0.032), (66, 0.029), (69, 0.011), (79, 0.041), (80, 0.014), (82, 0.045), (90, 0.014), (96, 0.031), (98, 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.73996514 106 emnlp-2011-Predicting a Scientific Communitys Response to an Article
Author: Dani Yogatama ; Michael Heilman ; Brendan O'Connor ; Chris Dyer ; Bryan R. Routledge ; Noah A. Smith
Abstract: We consider the problem of predicting measurable responses to scientific articles based primarily on their text content. Specifically, we consider papers in two fields (economics and computational linguistics) and make predictions about downloads and within-community citations. Our approach is based on generalized linear models, allowing interpretability; a novel extension that captures first-order temporal effects is also presented. We demonstrate that text features significantly improve accuracy of predictions over metadata features like authors, topical categories, and publication venues.
2 0.6375562 23 emnlp-2011-Bootstrapped Named Entity Recognition for Product Attribute Extraction
Author: Duangmanee Putthividhya ; Junling Hu
Abstract: We present a named entity recognition (NER) system for extracting product attributes and values from listing titles. Information extraction from short listing titles present a unique challenge, with the lack of informative context and grammatical structure. In this work, we combine supervised NER with bootstrapping to expand the seed list, and output normalized results. Focusing on listings from eBay’s clothing and shoes categories, our bootstrapped NER system is able to identify new brands corresponding to spelling variants and typographical errors of the known brands, as well as identifying novel brands. Among the top 300 new brands predicted, our system achieves 90.33% precision. To output normalized attribute values, we explore several string comparison algorithms and found n-gram substring matching to work well in practice.
3 0.40759215 107 emnlp-2011-Probabilistic models of similarity in syntactic context
Author: Diarmuid O Seaghdha ; Anna Korhonen
Abstract: This paper investigates novel methods for incorporating syntactic information in probabilistic latent variable models of lexical choice and contextual similarity. The resulting models capture the effects of context on the interpretation of a word and in particular its effect on the appropriateness of replacing that word with a potentially related one. Evaluating our techniques on two datasets, we report performance above the prior state of the art for estimating sentence similarity and ranking lexical substitutes.
4 0.39289767 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features
Author: Christos Christodoulopoulos ; Sharon Goldwater ; Mark Steedman
Abstract: In this paper we present a fully unsupervised syntactic class induction system formulated as a Bayesian multinomial mixture model, where each word type is constrained to belong to a single class. By using a mixture model rather than a sequence model (e.g., HMM), we are able to easily add multiple kinds of features, including those at both the type level (morphology features) and token level (context and alignment features, the latter from parallel corpora). Using only context features, our system yields results comparable to state-of-the art, far better than a similar model without the one-class-per-type constraint. Using the additional features provides added benefit, and our final system outperforms the best published results on most of the 25 corpora tested.
5 0.39027992 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
Author: Edward Grefenstette ; Mehrnoosh Sadrzadeh
Abstract: Modelling compositional meaning for sentences using empirical distributional methods has been a challenge for computational linguists. We implement the abstract categorical model of Coecke et al. (2010) using data from the BNC and evaluate it. The implementation is based on unsupervised learning of matrices for relational words and applying them to the vectors of their arguments. The evaluation is based on the word disambiguation task developed by Mitchell and Lapata (2008) for intransitive sentences, and on a similar new experiment designed for transitive sentences. Our model matches the results of its competitors . in the first experiment, and betters them in the second. The general improvement in results with increase in syntactic complexity showcases the compositional power of our model.
6 0.38984606 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing
7 0.38873759 128 emnlp-2011-Structured Relation Discovery using Generative Models
8 0.38847086 37 emnlp-2011-Cross-Cutting Models of Lexical Semantics
9 0.38835213 103 emnlp-2011-Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus
10 0.38813242 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
11 0.38761958 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
12 0.38671708 56 emnlp-2011-Exploring Supervised LDA Models for Assigning Attributes to Adjective-Noun Phrases
13 0.38564593 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
14 0.38534653 38 emnlp-2011-Data-Driven Response Generation in Social Media
15 0.38527882 33 emnlp-2011-Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs
16 0.38472426 35 emnlp-2011-Correcting Semantic Collocation Errors with L1-induced Paraphrases
17 0.38458523 119 emnlp-2011-Semantic Topic Models: Combining Word Distributional Statistics and Dictionary Definitions
18 0.3844665 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study
19 0.38442576 91 emnlp-2011-Literal and Metaphorical Sense Identification through Concrete and Abstract Context
20 0.38336122 39 emnlp-2011-Discovering Morphological Paradigms from Plain Text Using a Dirichlet Process Mixture Model